{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "e183c605",
   "metadata": {},
   "source": [
    "# MLP-Mixer 创新点与争议总结\n",
    "\n",
    "## 主要创新点\n",
    "\n",
    "### 1. **完全摒弃卷积与注意力机制**\n",
    "- 首次提出仅依赖MLP构建视觉模型的架构，无需传统CNN的卷积操作或Transformer的自注意力机制。\n",
    "- 核心思想：通过两种MLP层实现**通道混合（Channel Mixing）**和**Token混合（Token Mixing）**，分别处理特征通道间的关系和空间位置间的交互。\n",
    "\n",
    "### 2. **高效的混合操作设计**\n",
    "- **Token Mixing MLP**：在图像块（patch）序列上进行跨空间位置的信息交互，模拟卷积或注意力的空间建模能力。可以认为一个Global Convolution.\n",
    "- **Channel Mixing MLP**：对每个空间位置的特征通道进行非线性变换，实现类似通道注意力的效果。可以认为是1x1卷积.\n",
    "- 结合残差连接、层归一化和Dropout，增强模型稳定性和表达能力。\n",
    "\n",
    "### 3. **简化模型复杂度**\n",
    "- 仅依赖矩阵乘法等基础操作，降低硬件适配难度，理论上更易部署到轻量化场景。\n",
    "- 在ImageNet上达到SOTA性能，验证了纯MLP架构的可行性。\n",
    "\n",
    "---\n",
    "\n",
    "  ![alt text](resources/mlp_mixer.png \"Title\")\n",
    "\n",
    "## 争议与局限性\n",
    "\n",
    "### 1. **静态特征处理的局限性**\n",
    "- MLP的固定权重模式导致其对动态特征建模能力较弱，相比注意力机制的动态权重分配，可能限制复杂场景的适应性。\n",
    "- 缺乏对长距离依赖关系的灵活建模能力（如Transformer的全局注意力）。\n",
    "\n",
    "### 2. **训练难度与资源消耗**\n",
    "- 模型参数量和计算量较大，训练过程需要大量数据和算力支持，小规模实验容易失败。\n",
    "- 对超参数敏感，调参难度较高，被质疑“非赌徒心态难以复现”。\n",
    "\n",
    "### 3. **泛化能力存疑**\n",
    "- 在ImageNet等大型数据集表现优异，但在小数据或复杂任务（如目标检测、分割）中性能下降明显，通用性受限。\n",
    "- 部分观点认为其成功依赖于对训练数据分布的过拟合，而非真正突破网络结构瓶颈。\n",
    "\n",
    "---\n",
    "\n",
    "## 总结与意义\n",
    "MLP-Mixer通过纯MLP架构挑战了CNN和Transformer的视觉模型范式，证明了基础操作（矩阵乘法）的组合潜力。尽管存在争议，但它为模型设计提供了新思路——例如后续研究结合MLP与注意力机制的混合架构。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "de2f9895",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Use device:  cuda\n"
     ]
    }
   ],
   "source": [
    "# 自动重新加载外部module，使得修改代码之后无需重新import\n",
    "# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython\n",
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "\n",
    "from hdd.device.utils import get_device\n",
    "\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "import torch.optim as optim\n",
    "from torchvision import datasets, transforms\n",
    "\n",
    "# 设置训练数据的路径\n",
    "DATA_ROOT = \"~/workspace/hands-dirty-on-dl/dataset\"\n",
    "# 设置TensorBoard的路径\n",
    "TENSORBOARD_ROOT = \"~/workspace/hands-dirty-on-dl/dataset\"\n",
    "# 设置预训练模型参数路径\n",
    "TORCH_HUB_PATH = \"~/workspace/hands-dirty-on-dl/pretrained_models\"\n",
    "torch.hub.set_dir(TORCH_HUB_PATH)\n",
    "# 挑选最合适的训练设备\n",
    "DEVICE = get_device([\"cuda\", \"cpu\"])\n",
    "print(\"Use device: \", DEVICE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "56bd73e8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Files already downloaded and verified\n",
      "Files already downloaded and verified\n"
     ]
    }
   ],
   "source": [
    "from hdd.data_util.auto_augmentation import CIFAR10Policy\n",
    "\n",
    "# 训练超参数和数据增强来自 https://github.com/omihub777/ViT-CIFAR\n",
    "CIFAR_10_MEAN = [0.4914, 0.4822, 0.4465]\n",
    "CIFAR_10_STD = [0.2470, 0.2435, 0.2616]\n",
    "BATCH_SIZE = 256\n",
    "\n",
    "val_transform = transforms.Compose(\n",
    "    [\n",
    "        transforms.ToTensor(),\n",
    "        transforms.Normalize(CIFAR_10_MEAN, CIFAR_10_STD),\n",
    "    ]\n",
    ")\n",
    "\n",
    "val_dataloader = torch.utils.data.DataLoader(\n",
    "    datasets.CIFAR10(\n",
    "        root=DATA_ROOT, train=False, download=True, transform=val_transform\n",
    "    ),\n",
    "    batch_size=BATCH_SIZE,\n",
    "    shuffle=False,\n",
    "    num_workers=8,\n",
    "    pin_memory=True,\n",
    ")\n",
    "\n",
    "train_transform = transforms.Compose(\n",
    "    [\n",
    "        transforms.RandomCrop(size=32, padding=4),\n",
    "        transforms.RandomHorizontalFlip(),\n",
    "        CIFAR10Policy(),\n",
    "        transforms.ToTensor(),\n",
    "        transforms.Normalize(CIFAR_10_MEAN, CIFAR_10_STD),\n",
    "    ]\n",
    ")\n",
    "\n",
    "train_dataloader = torch.utils.data.DataLoader(\n",
    "    datasets.CIFAR10(\n",
    "        root=DATA_ROOT, train=True, download=True, transform=train_transform\n",
    "    ),\n",
    "    batch_size=BATCH_SIZE,\n",
    "    shuffle=True,\n",
    "    num_workers=8,\n",
    "    pin_memory=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "9e6088ec",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "#Parameter: 1099146\n",
      "Epoch: 1/800 Train Loss: 2.3554 Accuracy: 0.1019 Time: 2.20599  | Val Loss: 2.3622 Accuracy: 0.1092\n",
      "Epoch: 2/800 Train Loss: 2.1206 Accuracy: 0.2278 Time: 1.98097  | Val Loss: 1.8754 Accuracy: 0.3405\n",
      "Epoch: 3/800 Train Loss: 1.9304 Accuracy: 0.3281 Time: 1.87739  | Val Loss: 1.6786 Accuracy: 0.4550\n",
      "Epoch: 4/800 Train Loss: 1.7973 Accuracy: 0.3991 Time: 1.97301  | Val Loss: 1.5359 Accuracy: 0.5199\n",
      "Epoch: 5/800 Train Loss: 1.7024 Accuracy: 0.4444 Time: 1.95160  | Val Loss: 1.4563 Accuracy: 0.5715\n",
      "Epoch: 6/800 Train Loss: 1.6303 Accuracy: 0.4820 Time: 1.85779  | Val Loss: 1.4071 Accuracy: 0.5863\n",
      "Epoch: 7/800 Train Loss: 1.5478 Accuracy: 0.5205 Time: 1.91336  | Val Loss: 1.2997 Accuracy: 0.6432\n",
      "Epoch: 8/800 Train Loss: 1.4955 Accuracy: 0.5473 Time: 1.95487  | Val Loss: 1.2688 Accuracy: 0.6539\n",
      "Epoch: 9/800 Train Loss: 1.4409 Accuracy: 0.5756 Time: 1.88831  | Val Loss: 1.2391 Accuracy: 0.6705\n",
      "Epoch: 10/800 Train Loss: 1.4052 Accuracy: 0.5885 Time: 1.92552  | Val Loss: 1.1922 Accuracy: 0.6884\n",
      "Epoch: 11/800 Train Loss: 1.3681 Accuracy: 0.6078 Time: 1.93861  | Val Loss: 1.1528 Accuracy: 0.7081\n",
      "Epoch: 12/800 Train Loss: 1.3396 Accuracy: 0.6221 Time: 1.91081  | Val Loss: 1.1414 Accuracy: 0.7151\n",
      "Epoch: 13/800 Train Loss: 1.3141 Accuracy: 0.6314 Time: 1.93845  | Val Loss: 1.1202 Accuracy: 0.7221\n",
      "Epoch: 14/800 Train Loss: 1.2855 Accuracy: 0.6437 Time: 1.87163  | Val Loss: 1.1263 Accuracy: 0.7199\n",
      "Epoch: 15/800 Train Loss: 1.2600 Accuracy: 0.6556 Time: 1.90949  | Val Loss: 1.0738 Accuracy: 0.7471\n",
      "Epoch: 16/800 Train Loss: 1.2349 Accuracy: 0.6671 Time: 2.16460  | Val Loss: 1.0548 Accuracy: 0.7520\n",
      "Epoch: 17/800 Train Loss: 1.2129 Accuracy: 0.6784 Time: 1.97898  | Val Loss: 1.0615 Accuracy: 0.7520\n",
      "Epoch: 18/800 Train Loss: 1.1997 Accuracy: 0.6845 Time: 2.07845  | Val Loss: 1.0253 Accuracy: 0.7640\n",
      "Epoch: 19/800 Train Loss: 1.1867 Accuracy: 0.6902 Time: 1.87434  | Val Loss: 1.0130 Accuracy: 0.7651\n",
      "Epoch: 20/800 Train Loss: 1.1579 Accuracy: 0.7037 Time: 1.87367  | Val Loss: 1.0033 Accuracy: 0.7703\n",
      "Epoch: 21/800 Train Loss: 1.1481 Accuracy: 0.7061 Time: 1.91694  | Val Loss: 0.9985 Accuracy: 0.7758\n",
      "Epoch: 22/800 Train Loss: 1.1374 Accuracy: 0.7130 Time: 2.03457  | Val Loss: 0.9986 Accuracy: 0.7731\n",
      "Epoch: 23/800 Train Loss: 1.1195 Accuracy: 0.7200 Time: 2.11251  | Val Loss: 1.0037 Accuracy: 0.7770\n",
      "Epoch: 24/800 Train Loss: 1.1095 Accuracy: 0.7251 Time: 2.01546  | Val Loss: 0.9757 Accuracy: 0.7883\n",
      "Epoch: 25/800 Train Loss: 1.0978 Accuracy: 0.7277 Time: 1.97615  | Val Loss: 0.9623 Accuracy: 0.7974\n",
      "Epoch: 26/800 Train Loss: 1.0847 Accuracy: 0.7363 Time: 2.08166  | Val Loss: 0.9725 Accuracy: 0.7869\n",
      "Epoch: 27/800 Train Loss: 1.0742 Accuracy: 0.7422 Time: 2.02930  | Val Loss: 0.9533 Accuracy: 0.7987\n",
      "Epoch: 28/800 Train Loss: 1.0657 Accuracy: 0.7451 Time: 2.03199  | Val Loss: 0.9665 Accuracy: 0.7905\n",
      "Epoch: 29/800 Train Loss: 1.0533 Accuracy: 0.7489 Time: 2.00923  | Val Loss: 0.9551 Accuracy: 0.7990\n",
      "Epoch: 30/800 Train Loss: 1.0432 Accuracy: 0.7562 Time: 1.88131  | Val Loss: 0.9445 Accuracy: 0.8031\n",
      "Epoch: 31/800 Train Loss: 1.0329 Accuracy: 0.7606 Time: 1.90552  | Val Loss: 0.9456 Accuracy: 0.8000\n",
      "Epoch: 32/800 Train Loss: 1.0264 Accuracy: 0.7647 Time: 1.88570  | Val Loss: 0.9168 Accuracy: 0.8133\n",
      "Epoch: 33/800 Train Loss: 1.0186 Accuracy: 0.7666 Time: 1.95376  | Val Loss: 0.9131 Accuracy: 0.8184\n",
      "Epoch: 34/800 Train Loss: 1.0058 Accuracy: 0.7724 Time: 1.97860  | Val Loss: 0.9321 Accuracy: 0.8101\n",
      "Epoch: 35/800 Train Loss: 1.0001 Accuracy: 0.7758 Time: 1.98517  | Val Loss: 0.9141 Accuracy: 0.8145\n",
      "Epoch: 36/800 Train Loss: 0.9947 Accuracy: 0.7788 Time: 1.97172  | Val Loss: 0.9142 Accuracy: 0.8131\n",
      "Epoch: 37/800 Train Loss: 0.9853 Accuracy: 0.7817 Time: 2.05620  | Val Loss: 0.9029 Accuracy: 0.8191\n",
      "Epoch: 38/800 Train Loss: 0.9795 Accuracy: 0.7843 Time: 1.95546  | Val Loss: 0.9051 Accuracy: 0.8183\n",
      "Epoch: 39/800 Train Loss: 0.9724 Accuracy: 0.7879 Time: 2.07569  | Val Loss: 0.9003 Accuracy: 0.8206\n",
      "Epoch: 40/800 Train Loss: 0.9714 Accuracy: 0.7902 Time: 1.98988  | Val Loss: 0.8849 Accuracy: 0.8321\n",
      "Epoch: 41/800 Train Loss: 0.9523 Accuracy: 0.7971 Time: 1.98146  | Val Loss: 0.8864 Accuracy: 0.8270\n",
      "Epoch: 42/800 Train Loss: 0.9493 Accuracy: 0.8003 Time: 1.90258  | Val Loss: 0.8928 Accuracy: 0.8245\n",
      "Epoch: 43/800 Train Loss: 0.9418 Accuracy: 0.7998 Time: 1.96519  | Val Loss: 0.8904 Accuracy: 0.8264\n",
      "Epoch: 44/800 Train Loss: 0.9374 Accuracy: 0.8042 Time: 1.96770  | Val Loss: 0.8861 Accuracy: 0.8298\n",
      "Epoch: 45/800 Train Loss: 0.9325 Accuracy: 0.8068 Time: 1.96311  | Val Loss: 0.8865 Accuracy: 0.8297\n",
      "Epoch: 46/800 Train Loss: 0.9247 Accuracy: 0.8112 Time: 1.98262  | Val Loss: 0.9164 Accuracy: 0.8171\n",
      "Epoch: 47/800 Train Loss: 0.9252 Accuracy: 0.8086 Time: 1.96571  | Val Loss: 0.8816 Accuracy: 0.8353\n",
      "Epoch: 48/800 Train Loss: 0.9158 Accuracy: 0.8134 Time: 1.90112  | Val Loss: 0.8812 Accuracy: 0.8340\n",
      "Epoch: 49/800 Train Loss: 0.9150 Accuracy: 0.8146 Time: 1.96294  | Val Loss: 0.8925 Accuracy: 0.8308\n",
      "Epoch: 50/800 Train Loss: 0.9037 Accuracy: 0.8190 Time: 1.92988  | Val Loss: 0.8640 Accuracy: 0.8401\n",
      "Epoch: 51/800 Train Loss: 0.9021 Accuracy: 0.8208 Time: 1.93774  | Val Loss: 0.8618 Accuracy: 0.8452\n",
      "Epoch: 52/800 Train Loss: 0.8934 Accuracy: 0.8238 Time: 1.94781  | Val Loss: 0.8840 Accuracy: 0.8359\n",
      "Epoch: 53/800 Train Loss: 0.8919 Accuracy: 0.8245 Time: 1.99117  | Val Loss: 0.8720 Accuracy: 0.8348\n",
      "Epoch: 54/800 Train Loss: 0.8831 Accuracy: 0.8294 Time: 1.98273  | Val Loss: 0.8715 Accuracy: 0.8352\n",
      "Epoch: 55/800 Train Loss: 0.8818 Accuracy: 0.8296 Time: 2.00448  | Val Loss: 0.8661 Accuracy: 0.8397\n",
      "Epoch: 56/800 Train Loss: 0.8767 Accuracy: 0.8311 Time: 1.93108  | Val Loss: 0.8653 Accuracy: 0.8435\n",
      "Epoch: 57/800 Train Loss: 0.8722 Accuracy: 0.8332 Time: 2.00494  | Val Loss: 0.8622 Accuracy: 0.8415\n",
      "Epoch: 58/800 Train Loss: 0.8731 Accuracy: 0.8344 Time: 2.00064  | Val Loss: 0.8828 Accuracy: 0.8348\n",
      "Epoch: 59/800 Train Loss: 0.8709 Accuracy: 0.8351 Time: 1.96549  | Val Loss: 0.8733 Accuracy: 0.8384\n",
      "Epoch: 60/800 Train Loss: 0.8612 Accuracy: 0.8389 Time: 1.92791  | Val Loss: 0.8673 Accuracy: 0.8443\n",
      "Epoch: 61/800 Train Loss: 0.8596 Accuracy: 0.8411 Time: 1.97354  | Val Loss: 0.8847 Accuracy: 0.8343\n",
      "Epoch: 62/800 Train Loss: 0.8580 Accuracy: 0.8405 Time: 1.86170  | Val Loss: 0.8491 Accuracy: 0.8463\n",
      "Epoch: 63/800 Train Loss: 0.8553 Accuracy: 0.8426 Time: 1.86285  | Val Loss: 0.8549 Accuracy: 0.8465\n",
      "Epoch: 64/800 Train Loss: 0.8497 Accuracy: 0.8442 Time: 1.95440  | Val Loss: 0.8714 Accuracy: 0.8412\n",
      "Epoch: 65/800 Train Loss: 0.8450 Accuracy: 0.8463 Time: 1.91826  | Val Loss: 0.8496 Accuracy: 0.8476\n",
      "Epoch: 66/800 Train Loss: 0.8413 Accuracy: 0.8482 Time: 1.92229  | Val Loss: 0.8381 Accuracy: 0.8532\n",
      "Epoch: 67/800 Train Loss: 0.8330 Accuracy: 0.8496 Time: 1.94239  | Val Loss: 0.8570 Accuracy: 0.8482\n",
      "Epoch: 68/800 Train Loss: 0.8254 Accuracy: 0.8556 Time: 1.91547  | Val Loss: 0.8549 Accuracy: 0.8493\n",
      "Epoch: 69/800 Train Loss: 0.8298 Accuracy: 0.8530 Time: 1.87990  | Val Loss: 0.8548 Accuracy: 0.8475\n",
      "Epoch: 70/800 Train Loss: 0.8311 Accuracy: 0.8516 Time: 1.94518  | Val Loss: 0.8550 Accuracy: 0.8482\n",
      "Epoch: 71/800 Train Loss: 0.8256 Accuracy: 0.8537 Time: 1.99593  | Val Loss: 0.8714 Accuracy: 0.8457\n",
      "Epoch: 72/800 Train Loss: 0.8185 Accuracy: 0.8580 Time: 1.88434  | Val Loss: 0.8379 Accuracy: 0.8552\n",
      "Epoch: 73/800 Train Loss: 0.8167 Accuracy: 0.8606 Time: 1.87327  | Val Loss: 0.8400 Accuracy: 0.8522\n",
      "Epoch: 74/800 Train Loss: 0.8181 Accuracy: 0.8590 Time: 1.93901  | Val Loss: 0.8591 Accuracy: 0.8479\n",
      "Epoch: 75/800 Train Loss: 0.8129 Accuracy: 0.8611 Time: 1.90217  | Val Loss: 0.8447 Accuracy: 0.8500\n",
      "Epoch: 76/800 Train Loss: 0.8113 Accuracy: 0.8615 Time: 1.97472  | Val Loss: 0.8399 Accuracy: 0.8531\n",
      "Epoch: 77/800 Train Loss: 0.8089 Accuracy: 0.8640 Time: 1.96138  | Val Loss: 0.8412 Accuracy: 0.8528\n",
      "Epoch: 78/800 Train Loss: 0.8008 Accuracy: 0.8674 Time: 2.01731  | Val Loss: 0.8562 Accuracy: 0.8486\n",
      "Epoch: 79/800 Train Loss: 0.8001 Accuracy: 0.8669 Time: 1.86306  | Val Loss: 0.8556 Accuracy: 0.8536\n",
      "Epoch: 80/800 Train Loss: 0.7995 Accuracy: 0.8677 Time: 1.95512  | Val Loss: 0.8416 Accuracy: 0.8516\n",
      "Epoch: 81/800 Train Loss: 0.7970 Accuracy: 0.8687 Time: 1.89661  | Val Loss: 0.8337 Accuracy: 0.8602\n",
      "Epoch: 82/800 Train Loss: 0.7994 Accuracy: 0.8661 Time: 1.94552  | Val Loss: 0.8533 Accuracy: 0.8514\n",
      "Epoch: 83/800 Train Loss: 0.7911 Accuracy: 0.8704 Time: 1.98759  | Val Loss: 0.8702 Accuracy: 0.8450\n",
      "Epoch: 84/800 Train Loss: 0.7951 Accuracy: 0.8688 Time: 1.91543  | Val Loss: 0.8413 Accuracy: 0.8570\n",
      "Epoch: 85/800 Train Loss: 0.7877 Accuracy: 0.8729 Time: 1.95552  | Val Loss: 0.8575 Accuracy: 0.8458\n",
      "Epoch: 86/800 Train Loss: 0.7851 Accuracy: 0.8718 Time: 1.99335  | Val Loss: 0.8491 Accuracy: 0.8527\n",
      "Epoch: 87/800 Train Loss: 0.7797 Accuracy: 0.8760 Time: 1.95190  | Val Loss: 0.8433 Accuracy: 0.8533\n",
      "Epoch: 88/800 Train Loss: 0.7835 Accuracy: 0.8746 Time: 1.96973  | Val Loss: 0.8477 Accuracy: 0.8536\n",
      "Epoch: 89/800 Train Loss: 0.7757 Accuracy: 0.8774 Time: 1.90700  | Val Loss: 0.8650 Accuracy: 0.8488\n",
      "Epoch: 90/800 Train Loss: 0.7805 Accuracy: 0.8761 Time: 1.95512  | Val Loss: 0.8549 Accuracy: 0.8518\n",
      "Epoch: 91/800 Train Loss: 0.7769 Accuracy: 0.8782 Time: 1.97952  | Val Loss: 0.8566 Accuracy: 0.8519\n",
      "Epoch: 92/800 Train Loss: 0.7708 Accuracy: 0.8798 Time: 1.86989  | Val Loss: 0.8496 Accuracy: 0.8521\n",
      "Epoch: 93/800 Train Loss: 0.7689 Accuracy: 0.8811 Time: 1.98061  | Val Loss: 0.8648 Accuracy: 0.8515\n",
      "Epoch: 94/800 Train Loss: 0.7687 Accuracy: 0.8815 Time: 1.86119  | Val Loss: 0.8406 Accuracy: 0.8571\n",
      "Epoch: 95/800 Train Loss: 0.7649 Accuracy: 0.8828 Time: 1.97850  | Val Loss: 0.8463 Accuracy: 0.8558\n",
      "Epoch: 96/800 Train Loss: 0.7649 Accuracy: 0.8840 Time: 1.88979  | Val Loss: 0.8350 Accuracy: 0.8571\n",
      "Epoch: 97/800 Train Loss: 0.7640 Accuracy: 0.8834 Time: 1.90954  | Val Loss: 0.8274 Accuracy: 0.8652\n",
      "Epoch: 98/800 Train Loss: 0.7591 Accuracy: 0.8860 Time: 1.92062  | Val Loss: 0.8517 Accuracy: 0.8547\n",
      "Epoch: 99/800 Train Loss: 0.7584 Accuracy: 0.8865 Time: 2.02619  | Val Loss: 0.8646 Accuracy: 0.8538\n",
      "Epoch: 100/800 Train Loss: 0.7573 Accuracy: 0.8852 Time: 2.04846  | Val Loss: 0.8474 Accuracy: 0.8557\n",
      "Epoch: 101/800 Train Loss: 0.7599 Accuracy: 0.8846 Time: 2.02890  | Val Loss: 0.8419 Accuracy: 0.8613\n",
      "Epoch: 102/800 Train Loss: 0.7554 Accuracy: 0.8872 Time: 1.94569  | Val Loss: 0.8516 Accuracy: 0.8561\n",
      "Epoch: 103/800 Train Loss: 0.7509 Accuracy: 0.8905 Time: 1.85953  | Val Loss: 0.8406 Accuracy: 0.8569\n",
      "Epoch: 104/800 Train Loss: 0.7470 Accuracy: 0.8915 Time: 1.93320  | Val Loss: 0.8512 Accuracy: 0.8593\n",
      "Epoch: 105/800 Train Loss: 0.7491 Accuracy: 0.8900 Time: 1.98480  | Val Loss: 0.8451 Accuracy: 0.8597\n",
      "Epoch: 106/800 Train Loss: 0.7454 Accuracy: 0.8924 Time: 2.07316  | Val Loss: 0.8437 Accuracy: 0.8623\n",
      "Epoch: 107/800 Train Loss: 0.7463 Accuracy: 0.8911 Time: 2.13066  | Val Loss: 0.8468 Accuracy: 0.8596\n",
      "Epoch: 108/800 Train Loss: 0.7451 Accuracy: 0.8911 Time: 2.09719  | Val Loss: 0.8436 Accuracy: 0.8593\n",
      "Epoch: 109/800 Train Loss: 0.7432 Accuracy: 0.8924 Time: 2.02986  | Val Loss: 0.8479 Accuracy: 0.8608\n",
      "Epoch: 110/800 Train Loss: 0.7395 Accuracy: 0.8952 Time: 1.89558  | Val Loss: 0.8549 Accuracy: 0.8543\n",
      "Epoch: 111/800 Train Loss: 0.7415 Accuracy: 0.8937 Time: 1.94083  | Val Loss: 0.8415 Accuracy: 0.8579\n",
      "Epoch: 112/800 Train Loss: 0.7372 Accuracy: 0.8957 Time: 1.90609  | Val Loss: 0.8402 Accuracy: 0.8597\n",
      "Epoch: 113/800 Train Loss: 0.7362 Accuracy: 0.8969 Time: 1.86893  | Val Loss: 0.8381 Accuracy: 0.8611\n",
      "Epoch: 114/800 Train Loss: 0.7350 Accuracy: 0.8959 Time: 1.89741  | Val Loss: 0.8549 Accuracy: 0.8557\n",
      "Epoch: 115/800 Train Loss: 0.7311 Accuracy: 0.8984 Time: 1.95415  | Val Loss: 0.8493 Accuracy: 0.8609\n",
      "Epoch: 116/800 Train Loss: 0.7326 Accuracy: 0.8983 Time: 1.87852  | Val Loss: 0.8523 Accuracy: 0.8571\n",
      "Epoch: 117/800 Train Loss: 0.7317 Accuracy: 0.8985 Time: 1.87692  | Val Loss: 0.8375 Accuracy: 0.8612\n",
      "Epoch: 118/800 Train Loss: 0.7298 Accuracy: 0.8986 Time: 1.90884  | Val Loss: 0.8420 Accuracy: 0.8631\n",
      "Epoch: 119/800 Train Loss: 0.7281 Accuracy: 0.9005 Time: 1.99418  | Val Loss: 0.8447 Accuracy: 0.8648\n",
      "Epoch: 120/800 Train Loss: 0.7341 Accuracy: 0.8979 Time: 1.92614  | Val Loss: 0.8532 Accuracy: 0.8568\n",
      "Epoch: 121/800 Train Loss: 0.7287 Accuracy: 0.8980 Time: 1.96870  | Val Loss: 0.8506 Accuracy: 0.8536\n",
      "Epoch: 122/800 Train Loss: 0.7287 Accuracy: 0.8977 Time: 1.88792  | Val Loss: 0.8452 Accuracy: 0.8610\n",
      "Epoch: 123/800 Train Loss: 0.7224 Accuracy: 0.9025 Time: 1.87190  | Val Loss: 0.8507 Accuracy: 0.8607\n",
      "Epoch: 124/800 Train Loss: 0.7216 Accuracy: 0.9011 Time: 1.97589  | Val Loss: 0.8530 Accuracy: 0.8611\n",
      "Epoch: 125/800 Train Loss: 0.7232 Accuracy: 0.9006 Time: 1.94984  | Val Loss: 0.8498 Accuracy: 0.8580\n",
      "Epoch: 126/800 Train Loss: 0.7229 Accuracy: 0.9023 Time: 1.90124  | Val Loss: 0.8414 Accuracy: 0.8627\n",
      "Epoch: 127/800 Train Loss: 0.7174 Accuracy: 0.9040 Time: 1.91076  | Val Loss: 0.8572 Accuracy: 0.8616\n",
      "Epoch: 128/800 Train Loss: 0.7203 Accuracy: 0.9041 Time: 1.97513  | Val Loss: 0.8401 Accuracy: 0.8613\n",
      "Epoch: 129/800 Train Loss: 0.7190 Accuracy: 0.9040 Time: 2.00039  | Val Loss: 0.8459 Accuracy: 0.8637\n",
      "Epoch: 130/800 Train Loss: 0.7194 Accuracy: 0.9043 Time: 1.90511  | Val Loss: 0.8344 Accuracy: 0.8703\n",
      "Epoch: 131/800 Train Loss: 0.7165 Accuracy: 0.9050 Time: 1.98594  | Val Loss: 0.8383 Accuracy: 0.8605\n",
      "Epoch: 132/800 Train Loss: 0.7111 Accuracy: 0.9073 Time: 1.89755  | Val Loss: 0.8443 Accuracy: 0.8599\n",
      "Epoch: 133/800 Train Loss: 0.7129 Accuracy: 0.9052 Time: 1.88181  | Val Loss: 0.8416 Accuracy: 0.8630\n",
      "Epoch: 134/800 Train Loss: 0.7112 Accuracy: 0.9066 Time: 1.89417  | Val Loss: 0.8424 Accuracy: 0.8627\n",
      "Epoch: 135/800 Train Loss: 0.7084 Accuracy: 0.9078 Time: 1.94657  | Val Loss: 0.8471 Accuracy: 0.8632\n",
      "Epoch: 136/800 Train Loss: 0.7090 Accuracy: 0.9086 Time: 1.96540  | Val Loss: 0.8332 Accuracy: 0.8648\n",
      "Epoch: 137/800 Train Loss: 0.7114 Accuracy: 0.9070 Time: 1.85754  | Val Loss: 0.8353 Accuracy: 0.8642\n",
      "Epoch: 138/800 Train Loss: 0.7062 Accuracy: 0.9104 Time: 1.88519  | Val Loss: 0.8330 Accuracy: 0.8647\n",
      "Epoch: 139/800 Train Loss: 0.7073 Accuracy: 0.9073 Time: 1.95205  | Val Loss: 0.8390 Accuracy: 0.8639\n",
      "Epoch: 140/800 Train Loss: 0.7058 Accuracy: 0.9099 Time: 1.88670  | Val Loss: 0.8465 Accuracy: 0.8610\n",
      "Epoch: 141/800 Train Loss: 0.7041 Accuracy: 0.9096 Time: 1.93311  | Val Loss: 0.8397 Accuracy: 0.8650\n",
      "Epoch: 142/800 Train Loss: 0.6987 Accuracy: 0.9137 Time: 1.94719  | Val Loss: 0.8340 Accuracy: 0.8678\n",
      "Epoch: 143/800 Train Loss: 0.7018 Accuracy: 0.9113 Time: 1.95909  | Val Loss: 0.8388 Accuracy: 0.8645\n",
      "Epoch: 144/800 Train Loss: 0.7044 Accuracy: 0.9105 Time: 1.94490  | Val Loss: 0.8445 Accuracy: 0.8636\n",
      "Epoch: 145/800 Train Loss: 0.7013 Accuracy: 0.9110 Time: 1.94732  | Val Loss: 0.8377 Accuracy: 0.8659\n",
      "Epoch: 146/800 Train Loss: 0.6988 Accuracy: 0.9108 Time: 1.95684  | Val Loss: 0.8386 Accuracy: 0.8672\n",
      "Epoch: 147/800 Train Loss: 0.7046 Accuracy: 0.9108 Time: 1.85913  | Val Loss: 0.8340 Accuracy: 0.8671\n",
      "Epoch: 148/800 Train Loss: 0.6992 Accuracy: 0.9137 Time: 1.94206  | Val Loss: 0.8352 Accuracy: 0.8659\n",
      "Epoch: 149/800 Train Loss: 0.6923 Accuracy: 0.9156 Time: 1.89680  | Val Loss: 0.8397 Accuracy: 0.8665\n",
      "Epoch: 150/800 Train Loss: 0.6956 Accuracy: 0.9141 Time: 1.87821  | Val Loss: 0.8502 Accuracy: 0.8648\n",
      "Epoch: 151/800 Train Loss: 0.6988 Accuracy: 0.9127 Time: 1.94571  | Val Loss: 0.8514 Accuracy: 0.8636\n",
      "Epoch: 152/800 Train Loss: 0.6966 Accuracy: 0.9140 Time: 1.97843  | Val Loss: 0.8349 Accuracy: 0.8691\n",
      "Epoch: 153/800 Train Loss: 0.6937 Accuracy: 0.9137 Time: 1.88789  | Val Loss: 0.8385 Accuracy: 0.8628\n",
      "Epoch: 154/800 Train Loss: 0.6973 Accuracy: 0.9131 Time: 1.98423  | Val Loss: 0.8325 Accuracy: 0.8685\n",
      "Epoch: 155/800 Train Loss: 0.6920 Accuracy: 0.9156 Time: 1.91181  | Val Loss: 0.8320 Accuracy: 0.8672\n",
      "Epoch: 156/800 Train Loss: 0.6958 Accuracy: 0.9130 Time: 1.91580  | Val Loss: 0.8399 Accuracy: 0.8653\n",
      "Epoch: 157/800 Train Loss: 0.6924 Accuracy: 0.9156 Time: 1.89635  | Val Loss: 0.8347 Accuracy: 0.8654\n",
      "Epoch: 158/800 Train Loss: 0.6916 Accuracy: 0.9154 Time: 1.89932  | Val Loss: 0.8491 Accuracy: 0.8625\n",
      "Epoch: 159/800 Train Loss: 0.6940 Accuracy: 0.9151 Time: 1.88372  | Val Loss: 0.8386 Accuracy: 0.8671\n",
      "Epoch: 160/800 Train Loss: 0.6875 Accuracy: 0.9174 Time: 1.95270  | Val Loss: 0.8507 Accuracy: 0.8607\n",
      "Epoch: 161/800 Train Loss: 0.6869 Accuracy: 0.9185 Time: 1.96310  | Val Loss: 0.8471 Accuracy: 0.8615\n",
      "Epoch: 162/800 Train Loss: 0.6867 Accuracy: 0.9180 Time: 1.88313  | Val Loss: 0.8456 Accuracy: 0.8624\n",
      "Epoch: 163/800 Train Loss: 0.6869 Accuracy: 0.9167 Time: 1.89616  | Val Loss: 0.8495 Accuracy: 0.8634\n",
      "Epoch: 164/800 Train Loss: 0.6884 Accuracy: 0.9170 Time: 1.93892  | Val Loss: 0.8450 Accuracy: 0.8653\n",
      "Epoch: 165/800 Train Loss: 0.6879 Accuracy: 0.9183 Time: 1.93608  | Val Loss: 0.8410 Accuracy: 0.8666\n",
      "Epoch: 166/800 Train Loss: 0.6878 Accuracy: 0.9175 Time: 1.87098  | Val Loss: 0.8371 Accuracy: 0.8660\n",
      "Epoch: 167/800 Train Loss: 0.6869 Accuracy: 0.9183 Time: 1.95780  | Val Loss: 0.8345 Accuracy: 0.8700\n",
      "Epoch: 168/800 Train Loss: 0.6858 Accuracy: 0.9192 Time: 1.88751  | Val Loss: 0.8357 Accuracy: 0.8681\n",
      "Epoch: 169/800 Train Loss: 0.6825 Accuracy: 0.9196 Time: 1.90707  | Val Loss: 0.8377 Accuracy: 0.8710\n",
      "Epoch: 170/800 Train Loss: 0.6784 Accuracy: 0.9218 Time: 1.88860  | Val Loss: 0.8378 Accuracy: 0.8671\n",
      "Epoch: 171/800 Train Loss: 0.6792 Accuracy: 0.9215 Time: 2.01137  | Val Loss: 0.8400 Accuracy: 0.8672\n",
      "Epoch: 172/800 Train Loss: 0.6759 Accuracy: 0.9232 Time: 1.89836  | Val Loss: 0.8481 Accuracy: 0.8660\n",
      "Epoch: 173/800 Train Loss: 0.6832 Accuracy: 0.9198 Time: 1.99673  | Val Loss: 0.8464 Accuracy: 0.8663\n",
      "Epoch: 174/800 Train Loss: 0.6787 Accuracy: 0.9214 Time: 1.96460  | Val Loss: 0.8424 Accuracy: 0.8659\n",
      "Epoch: 175/800 Train Loss: 0.6758 Accuracy: 0.9228 Time: 2.01059  | Val Loss: 0.8390 Accuracy: 0.8649\n",
      "Epoch: 176/800 Train Loss: 0.6792 Accuracy: 0.9221 Time: 1.88129  | Val Loss: 0.8408 Accuracy: 0.8627\n",
      "Epoch: 177/800 Train Loss: 0.6799 Accuracy: 0.9208 Time: 1.91035  | Val Loss: 0.8375 Accuracy: 0.8677\n",
      "Epoch: 178/800 Train Loss: 0.6781 Accuracy: 0.9228 Time: 1.97354  | Val Loss: 0.8301 Accuracy: 0.8708\n",
      "Epoch: 179/800 Train Loss: 0.6778 Accuracy: 0.9209 Time: 1.97333  | Val Loss: 0.8393 Accuracy: 0.8665\n",
      "Epoch: 180/800 Train Loss: 0.6756 Accuracy: 0.9223 Time: 1.98919  | Val Loss: 0.8355 Accuracy: 0.8676\n",
      "Epoch: 181/800 Train Loss: 0.6761 Accuracy: 0.9222 Time: 1.93551  | Val Loss: 0.8473 Accuracy: 0.8649\n",
      "Epoch: 182/800 Train Loss: 0.6715 Accuracy: 0.9244 Time: 1.96786  | Val Loss: 0.8273 Accuracy: 0.8712\n",
      "Epoch: 183/800 Train Loss: 0.6742 Accuracy: 0.9232 Time: 1.87448  | Val Loss: 0.8442 Accuracy: 0.8643\n",
      "Epoch: 184/800 Train Loss: 0.6753 Accuracy: 0.9228 Time: 1.88810  | Val Loss: 0.8274 Accuracy: 0.8729\n",
      "Epoch: 185/800 Train Loss: 0.6690 Accuracy: 0.9253 Time: 1.97579  | Val Loss: 0.8318 Accuracy: 0.8692\n",
      "Epoch: 186/800 Train Loss: 0.6696 Accuracy: 0.9259 Time: 1.86266  | Val Loss: 0.8328 Accuracy: 0.8680\n",
      "Epoch: 187/800 Train Loss: 0.6713 Accuracy: 0.9248 Time: 1.99523  | Val Loss: 0.8245 Accuracy: 0.8718\n",
      "Epoch: 188/800 Train Loss: 0.6749 Accuracy: 0.9233 Time: 1.86745  | Val Loss: 0.8345 Accuracy: 0.8726\n",
      "Epoch: 189/800 Train Loss: 0.6710 Accuracy: 0.9236 Time: 1.99101  | Val Loss: 0.8259 Accuracy: 0.8731\n",
      "Epoch: 190/800 Train Loss: 0.6716 Accuracy: 0.9246 Time: 1.91743  | Val Loss: 0.8320 Accuracy: 0.8701\n",
      "Epoch: 191/800 Train Loss: 0.6695 Accuracy: 0.9256 Time: 1.92491  | Val Loss: 0.8342 Accuracy: 0.8720\n",
      "Epoch: 192/800 Train Loss: 0.6695 Accuracy: 0.9246 Time: 1.92688  | Val Loss: 0.8316 Accuracy: 0.8727\n",
      "Epoch: 193/800 Train Loss: 0.6639 Accuracy: 0.9287 Time: 1.91189  | Val Loss: 0.8280 Accuracy: 0.8733\n",
      "Epoch: 194/800 Train Loss: 0.6693 Accuracy: 0.9261 Time: 1.88482  | Val Loss: 0.8330 Accuracy: 0.8731\n",
      "Epoch: 195/800 Train Loss: 0.6640 Accuracy: 0.9289 Time: 2.00831  | Val Loss: 0.8240 Accuracy: 0.8727\n",
      "Epoch: 196/800 Train Loss: 0.6700 Accuracy: 0.9257 Time: 1.96827  | Val Loss: 0.8311 Accuracy: 0.8702\n",
      "Epoch: 197/800 Train Loss: 0.6669 Accuracy: 0.9260 Time: 1.94390  | Val Loss: 0.8316 Accuracy: 0.8741\n",
      "Epoch: 198/800 Train Loss: 0.6640 Accuracy: 0.9279 Time: 1.92426  | Val Loss: 0.8408 Accuracy: 0.8686\n",
      "Epoch: 199/800 Train Loss: 0.6661 Accuracy: 0.9267 Time: 1.99061  | Val Loss: 0.8437 Accuracy: 0.8638\n",
      "Epoch: 200/800 Train Loss: 0.6660 Accuracy: 0.9265 Time: 1.89254  | Val Loss: 0.8213 Accuracy: 0.8722\n",
      "Epoch: 201/800 Train Loss: 0.6630 Accuracy: 0.9287 Time: 2.03697  | Val Loss: 0.8226 Accuracy: 0.8744\n",
      "Epoch: 202/800 Train Loss: 0.6629 Accuracy: 0.9292 Time: 1.92982  | Val Loss: 0.8311 Accuracy: 0.8720\n",
      "Epoch: 203/800 Train Loss: 0.6560 Accuracy: 0.9317 Time: 1.99040  | Val Loss: 0.8354 Accuracy: 0.8713\n",
      "Epoch: 204/800 Train Loss: 0.6654 Accuracy: 0.9268 Time: 1.93251  | Val Loss: 0.8282 Accuracy: 0.8693\n",
      "Epoch: 205/800 Train Loss: 0.6635 Accuracy: 0.9276 Time: 1.98908  | Val Loss: 0.8261 Accuracy: 0.8731\n",
      "Epoch: 206/800 Train Loss: 0.6631 Accuracy: 0.9286 Time: 1.91512  | Val Loss: 0.8298 Accuracy: 0.8733\n",
      "Epoch: 207/800 Train Loss: 0.6613 Accuracy: 0.9298 Time: 1.95390  | Val Loss: 0.8280 Accuracy: 0.8706\n",
      "Epoch: 208/800 Train Loss: 0.6611 Accuracy: 0.9300 Time: 1.85019  | Val Loss: 0.8321 Accuracy: 0.8713\n",
      "Epoch: 209/800 Train Loss: 0.6589 Accuracy: 0.9306 Time: 1.90682  | Val Loss: 0.8297 Accuracy: 0.8697\n",
      "Epoch: 210/800 Train Loss: 0.6576 Accuracy: 0.9307 Time: 1.89002  | Val Loss: 0.8340 Accuracy: 0.8708\n",
      "Epoch: 211/800 Train Loss: 0.6583 Accuracy: 0.9305 Time: 1.89944  | Val Loss: 0.8237 Accuracy: 0.8720\n",
      "Epoch: 212/800 Train Loss: 0.6586 Accuracy: 0.9298 Time: 1.90640  | Val Loss: 0.8187 Accuracy: 0.8706\n",
      "Epoch: 213/800 Train Loss: 0.6525 Accuracy: 0.9339 Time: 1.85000  | Val Loss: 0.8318 Accuracy: 0.8723\n",
      "Epoch: 214/800 Train Loss: 0.6570 Accuracy: 0.9317 Time: 1.88546  | Val Loss: 0.8340 Accuracy: 0.8693\n",
      "Epoch: 215/800 Train Loss: 0.6615 Accuracy: 0.9298 Time: 1.88732  | Val Loss: 0.8343 Accuracy: 0.8682\n",
      "Epoch: 216/800 Train Loss: 0.6559 Accuracy: 0.9317 Time: 1.93567  | Val Loss: 0.8411 Accuracy: 0.8660\n",
      "Epoch: 217/800 Train Loss: 0.6591 Accuracy: 0.9301 Time: 1.87530  | Val Loss: 0.8310 Accuracy: 0.8739\n",
      "Epoch: 218/800 Train Loss: 0.6540 Accuracy: 0.9318 Time: 1.89041  | Val Loss: 0.8326 Accuracy: 0.8693\n",
      "Epoch: 219/800 Train Loss: 0.6527 Accuracy: 0.9331 Time: 1.90482  | Val Loss: 0.8210 Accuracy: 0.8736\n",
      "Epoch: 220/800 Train Loss: 0.6546 Accuracy: 0.9323 Time: 1.87036  | Val Loss: 0.8150 Accuracy: 0.8741\n",
      "Epoch: 221/800 Train Loss: 0.6533 Accuracy: 0.9316 Time: 1.84901  | Val Loss: 0.8263 Accuracy: 0.8707\n",
      "Epoch: 222/800 Train Loss: 0.6565 Accuracy: 0.9310 Time: 1.95475  | Val Loss: 0.8187 Accuracy: 0.8745\n",
      "Epoch: 223/800 Train Loss: 0.6526 Accuracy: 0.9328 Time: 1.94037  | Val Loss: 0.8276 Accuracy: 0.8722\n",
      "Epoch: 224/800 Train Loss: 0.6538 Accuracy: 0.9324 Time: 1.88102  | Val Loss: 0.8247 Accuracy: 0.8710\n",
      "Epoch: 225/800 Train Loss: 0.6519 Accuracy: 0.9326 Time: 1.89542  | Val Loss: 0.8280 Accuracy: 0.8718\n",
      "Epoch: 226/800 Train Loss: 0.6511 Accuracy: 0.9338 Time: 1.96899  | Val Loss: 0.8241 Accuracy: 0.8729\n",
      "Epoch: 227/800 Train Loss: 0.6528 Accuracy: 0.9326 Time: 1.99548  | Val Loss: 0.8303 Accuracy: 0.8735\n",
      "Epoch: 228/800 Train Loss: 0.6534 Accuracy: 0.9326 Time: 1.92028  | Val Loss: 0.8239 Accuracy: 0.8729\n",
      "Epoch: 229/800 Train Loss: 0.6481 Accuracy: 0.9357 Time: 1.88806  | Val Loss: 0.8277 Accuracy: 0.8701\n",
      "Epoch: 230/800 Train Loss: 0.6522 Accuracy: 0.9328 Time: 1.90022  | Val Loss: 0.8272 Accuracy: 0.8706\n",
      "Epoch: 231/800 Train Loss: 0.6507 Accuracy: 0.9336 Time: 1.86453  | Val Loss: 0.8256 Accuracy: 0.8743\n",
      "Epoch: 232/800 Train Loss: 0.6502 Accuracy: 0.9339 Time: 1.89604  | Val Loss: 0.8285 Accuracy: 0.8738\n",
      "Epoch: 233/800 Train Loss: 0.6481 Accuracy: 0.9342 Time: 1.86929  | Val Loss: 0.8315 Accuracy: 0.8743\n",
      "Epoch: 234/800 Train Loss: 0.6478 Accuracy: 0.9345 Time: 1.99687  | Val Loss: 0.8154 Accuracy: 0.8785\n",
      "Epoch: 235/800 Train Loss: 0.6478 Accuracy: 0.9351 Time: 1.87503  | Val Loss: 0.8275 Accuracy: 0.8722\n",
      "Epoch: 236/800 Train Loss: 0.6503 Accuracy: 0.9341 Time: 1.89702  | Val Loss: 0.8253 Accuracy: 0.8743\n",
      "Epoch: 237/800 Train Loss: 0.6477 Accuracy: 0.9357 Time: 1.90174  | Val Loss: 0.8303 Accuracy: 0.8745\n",
      "Epoch: 238/800 Train Loss: 0.6462 Accuracy: 0.9365 Time: 1.86612  | Val Loss: 0.8285 Accuracy: 0.8756\n",
      "Epoch: 239/800 Train Loss: 0.6453 Accuracy: 0.9358 Time: 1.87093  | Val Loss: 0.8280 Accuracy: 0.8729\n",
      "Epoch: 240/800 Train Loss: 0.6499 Accuracy: 0.9338 Time: 1.87086  | Val Loss: 0.8275 Accuracy: 0.8726\n",
      "Epoch: 241/800 Train Loss: 0.6443 Accuracy: 0.9367 Time: 1.88925  | Val Loss: 0.8327 Accuracy: 0.8750\n",
      "Epoch: 242/800 Train Loss: 0.6434 Accuracy: 0.9368 Time: 1.89259  | Val Loss: 0.8420 Accuracy: 0.8689\n",
      "Epoch: 243/800 Train Loss: 0.6437 Accuracy: 0.9369 Time: 1.87826  | Val Loss: 0.8360 Accuracy: 0.8747\n",
      "Epoch: 244/800 Train Loss: 0.6490 Accuracy: 0.9345 Time: 1.85833  | Val Loss: 0.8308 Accuracy: 0.8730\n",
      "Epoch: 245/800 Train Loss: 0.6443 Accuracy: 0.9367 Time: 1.87681  | Val Loss: 0.8350 Accuracy: 0.8744\n",
      "Epoch: 246/800 Train Loss: 0.6432 Accuracy: 0.9365 Time: 1.89802  | Val Loss: 0.8335 Accuracy: 0.8727\n",
      "Epoch: 247/800 Train Loss: 0.6443 Accuracy: 0.9362 Time: 1.87088  | Val Loss: 0.8341 Accuracy: 0.8717\n",
      "Epoch: 248/800 Train Loss: 0.6456 Accuracy: 0.9350 Time: 1.91037  | Val Loss: 0.8256 Accuracy: 0.8726\n",
      "Epoch: 249/800 Train Loss: 0.6434 Accuracy: 0.9371 Time: 1.90725  | Val Loss: 0.8215 Accuracy: 0.8786\n",
      "Epoch: 250/800 Train Loss: 0.6449 Accuracy: 0.9354 Time: 1.88957  | Val Loss: 0.8439 Accuracy: 0.8697\n",
      "Epoch: 251/800 Train Loss: 0.6418 Accuracy: 0.9379 Time: 1.95760  | Val Loss: 0.8295 Accuracy: 0.8743\n",
      "Epoch: 252/800 Train Loss: 0.6434 Accuracy: 0.9376 Time: 1.85470  | Val Loss: 0.8368 Accuracy: 0.8764\n",
      "Epoch: 253/800 Train Loss: 0.6409 Accuracy: 0.9384 Time: 1.85095  | Val Loss: 0.8398 Accuracy: 0.8708\n",
      "Epoch: 254/800 Train Loss: 0.6391 Accuracy: 0.9379 Time: 1.89687  | Val Loss: 0.8305 Accuracy: 0.8737\n",
      "Epoch: 255/800 Train Loss: 0.6427 Accuracy: 0.9373 Time: 1.89146  | Val Loss: 0.8327 Accuracy: 0.8720\n",
      "Epoch: 256/800 Train Loss: 0.6370 Accuracy: 0.9400 Time: 1.92966  | Val Loss: 0.8390 Accuracy: 0.8726\n",
      "Epoch: 257/800 Train Loss: 0.6447 Accuracy: 0.9365 Time: 1.87801  | Val Loss: 0.8255 Accuracy: 0.8744\n",
      "Epoch: 258/800 Train Loss: 0.6405 Accuracy: 0.9385 Time: 1.90098  | Val Loss: 0.8298 Accuracy: 0.8760\n",
      "Epoch: 259/800 Train Loss: 0.6432 Accuracy: 0.9366 Time: 1.91858  | Val Loss: 0.8311 Accuracy: 0.8744\n",
      "Epoch: 260/800 Train Loss: 0.6393 Accuracy: 0.9385 Time: 1.87543  | Val Loss: 0.8377 Accuracy: 0.8719\n",
      "Epoch: 261/800 Train Loss: 0.6395 Accuracy: 0.9400 Time: 1.86374  | Val Loss: 0.8359 Accuracy: 0.8722\n",
      "Epoch: 262/800 Train Loss: 0.6407 Accuracy: 0.9378 Time: 1.90167  | Val Loss: 0.8280 Accuracy: 0.8748\n",
      "Epoch: 263/800 Train Loss: 0.6413 Accuracy: 0.9380 Time: 1.90685  | Val Loss: 0.8308 Accuracy: 0.8733\n",
      "Epoch: 264/800 Train Loss: 0.6402 Accuracy: 0.9381 Time: 1.90048  | Val Loss: 0.8222 Accuracy: 0.8745\n",
      "Epoch: 265/800 Train Loss: 0.6385 Accuracy: 0.9390 Time: 1.85024  | Val Loss: 0.8386 Accuracy: 0.8742\n",
      "Epoch: 266/800 Train Loss: 0.6382 Accuracy: 0.9387 Time: 1.88657  | Val Loss: 0.8293 Accuracy: 0.8754\n",
      "Epoch: 267/800 Train Loss: 0.6376 Accuracy: 0.9399 Time: 1.88582  | Val Loss: 0.8217 Accuracy: 0.8760\n",
      "Epoch: 268/800 Train Loss: 0.6361 Accuracy: 0.9401 Time: 1.84557  | Val Loss: 0.8234 Accuracy: 0.8759\n",
      "Epoch: 269/800 Train Loss: 0.6360 Accuracy: 0.9394 Time: 1.85055  | Val Loss: 0.8240 Accuracy: 0.8769\n",
      "Epoch: 270/800 Train Loss: 0.6383 Accuracy: 0.9395 Time: 1.91269  | Val Loss: 0.8264 Accuracy: 0.8762\n",
      "Epoch: 271/800 Train Loss: 0.6353 Accuracy: 0.9405 Time: 1.93741  | Val Loss: 0.8217 Accuracy: 0.8773\n",
      "Epoch: 272/800 Train Loss: 0.6369 Accuracy: 0.9397 Time: 1.87670  | Val Loss: 0.8305 Accuracy: 0.8746\n",
      "Epoch: 273/800 Train Loss: 0.6351 Accuracy: 0.9417 Time: 1.84630  | Val Loss: 0.8206 Accuracy: 0.8763\n",
      "Epoch: 274/800 Train Loss: 0.6367 Accuracy: 0.9399 Time: 1.89559  | Val Loss: 0.8258 Accuracy: 0.8752\n",
      "Epoch: 275/800 Train Loss: 0.6350 Accuracy: 0.9412 Time: 1.86392  | Val Loss: 0.8334 Accuracy: 0.8717\n",
      "Epoch: 276/800 Train Loss: 0.6319 Accuracy: 0.9418 Time: 1.86650  | Val Loss: 0.8252 Accuracy: 0.8762\n",
      "Epoch: 277/800 Train Loss: 0.6308 Accuracy: 0.9426 Time: 1.90355  | Val Loss: 0.8228 Accuracy: 0.8780\n",
      "Epoch: 278/800 Train Loss: 0.6348 Accuracy: 0.9412 Time: 1.91643  | Val Loss: 0.8235 Accuracy: 0.8781\n",
      "Epoch: 279/800 Train Loss: 0.6335 Accuracy: 0.9415 Time: 1.87376  | Val Loss: 0.8239 Accuracy: 0.8740\n",
      "Epoch: 280/800 Train Loss: 0.6344 Accuracy: 0.9409 Time: 1.86258  | Val Loss: 0.8287 Accuracy: 0.8731\n",
      "Epoch: 281/800 Train Loss: 0.6309 Accuracy: 0.9428 Time: 2.00330  | Val Loss: 0.8357 Accuracy: 0.8714\n",
      "Epoch: 282/800 Train Loss: 0.6333 Accuracy: 0.9410 Time: 1.85141  | Val Loss: 0.8181 Accuracy: 0.8778\n",
      "Epoch: 283/800 Train Loss: 0.6324 Accuracy: 0.9417 Time: 1.93449  | Val Loss: 0.8351 Accuracy: 0.8738\n",
      "Epoch: 284/800 Train Loss: 0.6310 Accuracy: 0.9431 Time: 1.85455  | Val Loss: 0.8307 Accuracy: 0.8747\n",
      "Epoch: 285/800 Train Loss: 0.6371 Accuracy: 0.9388 Time: 1.87426  | Val Loss: 0.8211 Accuracy: 0.8767\n",
      "Epoch: 286/800 Train Loss: 0.6305 Accuracy: 0.9426 Time: 1.89581  | Val Loss: 0.8310 Accuracy: 0.8757\n",
      "Epoch: 287/800 Train Loss: 0.6308 Accuracy: 0.9424 Time: 1.93361  | Val Loss: 0.8281 Accuracy: 0.8747\n",
      "Epoch: 288/800 Train Loss: 0.6303 Accuracy: 0.9438 Time: 1.90630  | Val Loss: 0.8236 Accuracy: 0.8763\n",
      "Epoch: 289/800 Train Loss: 0.6327 Accuracy: 0.9412 Time: 1.94327  | Val Loss: 0.8203 Accuracy: 0.8760\n",
      "Epoch: 290/800 Train Loss: 0.6330 Accuracy: 0.9417 Time: 1.96191  | Val Loss: 0.8272 Accuracy: 0.8740\n",
      "Epoch: 291/800 Train Loss: 0.6295 Accuracy: 0.9432 Time: 1.88391  | Val Loss: 0.8224 Accuracy: 0.8779\n",
      "Epoch: 292/800 Train Loss: 0.6325 Accuracy: 0.9423 Time: 1.91462  | Val Loss: 0.8046 Accuracy: 0.8805\n",
      "Epoch: 293/800 Train Loss: 0.6306 Accuracy: 0.9430 Time: 1.90475  | Val Loss: 0.8247 Accuracy: 0.8793\n",
      "Epoch: 294/800 Train Loss: 0.6279 Accuracy: 0.9439 Time: 1.90975  | Val Loss: 0.8304 Accuracy: 0.8788\n",
      "Epoch: 295/800 Train Loss: 0.6289 Accuracy: 0.9426 Time: 1.89969  | Val Loss: 0.8280 Accuracy: 0.8741\n",
      "Epoch: 296/800 Train Loss: 0.6276 Accuracy: 0.9442 Time: 1.91171  | Val Loss: 0.8224 Accuracy: 0.8749\n",
      "Epoch: 297/800 Train Loss: 0.6326 Accuracy: 0.9421 Time: 1.96940  | Val Loss: 0.8180 Accuracy: 0.8780\n",
      "Epoch: 298/800 Train Loss: 0.6273 Accuracy: 0.9444 Time: 1.98231  | Val Loss: 0.8272 Accuracy: 0.8759\n",
      "Epoch: 299/800 Train Loss: 0.6251 Accuracy: 0.9450 Time: 1.86663  | Val Loss: 0.8137 Accuracy: 0.8780\n",
      "Epoch: 300/800 Train Loss: 0.6263 Accuracy: 0.9448 Time: 1.88431  | Val Loss: 0.8218 Accuracy: 0.8764\n",
      "Epoch: 301/800 Train Loss: 0.6242 Accuracy: 0.9460 Time: 1.89365  | Val Loss: 0.8228 Accuracy: 0.8781\n",
      "Epoch: 302/800 Train Loss: 0.6275 Accuracy: 0.9437 Time: 1.89885  | Val Loss: 0.8208 Accuracy: 0.8773\n",
      "Epoch: 303/800 Train Loss: 0.6264 Accuracy: 0.9440 Time: 1.92298  | Val Loss: 0.8199 Accuracy: 0.8761\n",
      "Epoch: 304/800 Train Loss: 0.6240 Accuracy: 0.9457 Time: 1.88896  | Val Loss: 0.8165 Accuracy: 0.8776\n",
      "Epoch: 305/800 Train Loss: 0.6235 Accuracy: 0.9458 Time: 1.87154  | Val Loss: 0.8238 Accuracy: 0.8791\n",
      "Epoch: 306/800 Train Loss: 0.6234 Accuracy: 0.9461 Time: 1.84722  | Val Loss: 0.8333 Accuracy: 0.8742\n",
      "Epoch: 307/800 Train Loss: 0.6218 Accuracy: 0.9460 Time: 1.86077  | Val Loss: 0.8282 Accuracy: 0.8788\n",
      "Epoch: 308/800 Train Loss: 0.6250 Accuracy: 0.9456 Time: 1.86377  | Val Loss: 0.8247 Accuracy: 0.8766\n",
      "Epoch: 309/800 Train Loss: 0.6236 Accuracy: 0.9458 Time: 1.94337  | Val Loss: 0.8211 Accuracy: 0.8789\n",
      "Epoch: 310/800 Train Loss: 0.6207 Accuracy: 0.9474 Time: 1.86735  | Val Loss: 0.8264 Accuracy: 0.8759\n",
      "Epoch: 311/800 Train Loss: 0.6243 Accuracy: 0.9454 Time: 1.87094  | Val Loss: 0.8244 Accuracy: 0.8768\n",
      "Epoch: 312/800 Train Loss: 0.6257 Accuracy: 0.9446 Time: 1.92713  | Val Loss: 0.8255 Accuracy: 0.8744\n",
      "Epoch: 313/800 Train Loss: 0.6256 Accuracy: 0.9454 Time: 1.95781  | Val Loss: 0.8285 Accuracy: 0.8747\n",
      "Epoch: 314/800 Train Loss: 0.6214 Accuracy: 0.9470 Time: 1.88594  | Val Loss: 0.8376 Accuracy: 0.8712\n",
      "Epoch: 315/800 Train Loss: 0.6268 Accuracy: 0.9439 Time: 1.88986  | Val Loss: 0.8281 Accuracy: 0.8738\n",
      "Epoch: 316/800 Train Loss: 0.6226 Accuracy: 0.9462 Time: 1.85733  | Val Loss: 0.8272 Accuracy: 0.8760\n",
      "Epoch: 317/800 Train Loss: 0.6203 Accuracy: 0.9472 Time: 1.89553  | Val Loss: 0.8256 Accuracy: 0.8774\n",
      "Epoch: 318/800 Train Loss: 0.6201 Accuracy: 0.9464 Time: 1.87501  | Val Loss: 0.8201 Accuracy: 0.8751\n",
      "Epoch: 319/800 Train Loss: 0.6195 Accuracy: 0.9478 Time: 1.86216  | Val Loss: 0.8274 Accuracy: 0.8784\n",
      "Epoch: 320/800 Train Loss: 0.6203 Accuracy: 0.9476 Time: 1.90107  | Val Loss: 0.8305 Accuracy: 0.8764\n",
      "Epoch: 321/800 Train Loss: 0.6192 Accuracy: 0.9470 Time: 1.94720  | Val Loss: 0.8291 Accuracy: 0.8744\n",
      "Epoch: 322/800 Train Loss: 0.6200 Accuracy: 0.9469 Time: 1.91324  | Val Loss: 0.8240 Accuracy: 0.8763\n",
      "Epoch: 323/800 Train Loss: 0.6194 Accuracy: 0.9475 Time: 1.89419  | Val Loss: 0.8154 Accuracy: 0.8798\n",
      "Epoch: 324/800 Train Loss: 0.6183 Accuracy: 0.9488 Time: 1.96151  | Val Loss: 0.8272 Accuracy: 0.8772\n",
      "Epoch: 325/800 Train Loss: 0.6146 Accuracy: 0.9496 Time: 1.91006  | Val Loss: 0.8168 Accuracy: 0.8800\n",
      "Epoch: 326/800 Train Loss: 0.6238 Accuracy: 0.9452 Time: 1.93176  | Val Loss: 0.8237 Accuracy: 0.8754\n",
      "Epoch: 327/800 Train Loss: 0.6210 Accuracy: 0.9469 Time: 1.96925  | Val Loss: 0.8303 Accuracy: 0.8760\n",
      "Epoch: 328/800 Train Loss: 0.6193 Accuracy: 0.9476 Time: 1.94885  | Val Loss: 0.8284 Accuracy: 0.8754\n",
      "Epoch: 329/800 Train Loss: 0.6247 Accuracy: 0.9455 Time: 1.87442  | Val Loss: 0.8311 Accuracy: 0.8758\n",
      "Epoch: 330/800 Train Loss: 0.6224 Accuracy: 0.9458 Time: 1.90185  | Val Loss: 0.8347 Accuracy: 0.8742\n",
      "Epoch: 331/800 Train Loss: 0.6179 Accuracy: 0.9472 Time: 1.88080  | Val Loss: 0.8345 Accuracy: 0.8749\n",
      "Epoch: 332/800 Train Loss: 0.6157 Accuracy: 0.9497 Time: 1.87107  | Val Loss: 0.8336 Accuracy: 0.8768\n",
      "Epoch: 333/800 Train Loss: 0.6203 Accuracy: 0.9480 Time: 1.90279  | Val Loss: 0.8144 Accuracy: 0.8809\n",
      "Epoch: 334/800 Train Loss: 0.6170 Accuracy: 0.9481 Time: 1.92365  | Val Loss: 0.8225 Accuracy: 0.8761\n",
      "Epoch: 335/800 Train Loss: 0.6155 Accuracy: 0.9495 Time: 1.87955  | Val Loss: 0.8286 Accuracy: 0.8756\n",
      "Epoch: 336/800 Train Loss: 0.6177 Accuracy: 0.9473 Time: 1.91009  | Val Loss: 0.8284 Accuracy: 0.8752\n",
      "Epoch: 337/800 Train Loss: 0.6144 Accuracy: 0.9492 Time: 1.89097  | Val Loss: 0.8294 Accuracy: 0.8770\n",
      "Epoch: 338/800 Train Loss: 0.6174 Accuracy: 0.9485 Time: 1.87632  | Val Loss: 0.8236 Accuracy: 0.8798\n",
      "Epoch: 339/800 Train Loss: 0.6177 Accuracy: 0.9481 Time: 1.87698  | Val Loss: 0.8177 Accuracy: 0.8786\n",
      "Epoch: 340/800 Train Loss: 0.6129 Accuracy: 0.9509 Time: 1.87280  | Val Loss: 0.8213 Accuracy: 0.8770\n",
      "Epoch: 341/800 Train Loss: 0.6143 Accuracy: 0.9493 Time: 1.87496  | Val Loss: 0.8151 Accuracy: 0.8810\n",
      "Epoch: 342/800 Train Loss: 0.6159 Accuracy: 0.9491 Time: 1.99424  | Val Loss: 0.8249 Accuracy: 0.8784\n",
      "Epoch: 343/800 Train Loss: 0.6141 Accuracy: 0.9502 Time: 1.86430  | Val Loss: 0.8238 Accuracy: 0.8784\n",
      "Epoch: 344/800 Train Loss: 0.6126 Accuracy: 0.9501 Time: 1.86043  | Val Loss: 0.8190 Accuracy: 0.8809\n",
      "Epoch: 345/800 Train Loss: 0.6157 Accuracy: 0.9489 Time: 1.85193  | Val Loss: 0.8222 Accuracy: 0.8774\n",
      "Epoch: 346/800 Train Loss: 0.6167 Accuracy: 0.9478 Time: 1.93128  | Val Loss: 0.8233 Accuracy: 0.8770\n",
      "Epoch: 347/800 Train Loss: 0.6171 Accuracy: 0.9489 Time: 1.87577  | Val Loss: 0.8123 Accuracy: 0.8830\n",
      "Epoch: 348/800 Train Loss: 0.6139 Accuracy: 0.9494 Time: 1.84676  | Val Loss: 0.8305 Accuracy: 0.8784\n",
      "Epoch: 349/800 Train Loss: 0.6106 Accuracy: 0.9514 Time: 1.90914  | Val Loss: 0.8226 Accuracy: 0.8809\n",
      "Epoch: 350/800 Train Loss: 0.6173 Accuracy: 0.9482 Time: 1.87874  | Val Loss: 0.8247 Accuracy: 0.8787\n",
      "Epoch: 351/800 Train Loss: 0.6148 Accuracy: 0.9488 Time: 1.87311  | Val Loss: 0.8126 Accuracy: 0.8822\n",
      "Epoch: 352/800 Train Loss: 0.6155 Accuracy: 0.9491 Time: 1.85462  | Val Loss: 0.8233 Accuracy: 0.8812\n",
      "Epoch: 353/800 Train Loss: 0.6128 Accuracy: 0.9504 Time: 1.92160  | Val Loss: 0.8190 Accuracy: 0.8818\n",
      "Epoch: 354/800 Train Loss: 0.6142 Accuracy: 0.9499 Time: 1.88696  | Val Loss: 0.8277 Accuracy: 0.8767\n",
      "Epoch: 355/800 Train Loss: 0.6167 Accuracy: 0.9489 Time: 1.90382  | Val Loss: 0.8137 Accuracy: 0.8808\n",
      "Epoch: 356/800 Train Loss: 0.6106 Accuracy: 0.9514 Time: 1.90603  | Val Loss: 0.8153 Accuracy: 0.8820\n",
      "Epoch: 357/800 Train Loss: 0.6115 Accuracy: 0.9508 Time: 1.92676  | Val Loss: 0.8178 Accuracy: 0.8797\n",
      "Epoch: 358/800 Train Loss: 0.6156 Accuracy: 0.9489 Time: 1.91448  | Val Loss: 0.8214 Accuracy: 0.8757\n",
      "Epoch: 359/800 Train Loss: 0.6144 Accuracy: 0.9498 Time: 1.94909  | Val Loss: 0.8219 Accuracy: 0.8812\n",
      "Epoch: 360/800 Train Loss: 0.6115 Accuracy: 0.9512 Time: 1.91825  | Val Loss: 0.8069 Accuracy: 0.8876\n",
      "Epoch: 361/800 Train Loss: 0.6106 Accuracy: 0.9519 Time: 1.87012  | Val Loss: 0.8120 Accuracy: 0.8822\n",
      "Epoch: 362/800 Train Loss: 0.6136 Accuracy: 0.9500 Time: 1.87056  | Val Loss: 0.8144 Accuracy: 0.8795\n",
      "Epoch: 363/800 Train Loss: 0.6122 Accuracy: 0.9501 Time: 1.87166  | Val Loss: 0.8165 Accuracy: 0.8797\n",
      "Epoch: 364/800 Train Loss: 0.6115 Accuracy: 0.9508 Time: 1.88385  | Val Loss: 0.8216 Accuracy: 0.8827\n",
      "Epoch: 365/800 Train Loss: 0.6124 Accuracy: 0.9510 Time: 1.97088  | Val Loss: 0.8138 Accuracy: 0.8793\n",
      "Epoch: 366/800 Train Loss: 0.6125 Accuracy: 0.9504 Time: 1.90354  | Val Loss: 0.8204 Accuracy: 0.8784\n",
      "Epoch: 367/800 Train Loss: 0.6095 Accuracy: 0.9516 Time: 1.92916  | Val Loss: 0.8187 Accuracy: 0.8800\n",
      "Epoch: 368/800 Train Loss: 0.6111 Accuracy: 0.9509 Time: 1.95394  | Val Loss: 0.8108 Accuracy: 0.8832\n",
      "Epoch: 369/800 Train Loss: 0.6116 Accuracy: 0.9505 Time: 1.84929  | Val Loss: 0.8116 Accuracy: 0.8835\n",
      "Epoch: 370/800 Train Loss: 0.6067 Accuracy: 0.9530 Time: 1.84919  | Val Loss: 0.8216 Accuracy: 0.8776\n",
      "Epoch: 371/800 Train Loss: 0.6085 Accuracy: 0.9521 Time: 1.85611  | Val Loss: 0.8092 Accuracy: 0.8825\n",
      "Epoch: 372/800 Train Loss: 0.6078 Accuracy: 0.9517 Time: 1.90911  | Val Loss: 0.8107 Accuracy: 0.8840\n",
      "Epoch: 373/800 Train Loss: 0.6076 Accuracy: 0.9535 Time: 1.87708  | Val Loss: 0.8217 Accuracy: 0.8802\n",
      "Epoch: 374/800 Train Loss: 0.6070 Accuracy: 0.9533 Time: 1.88274  | Val Loss: 0.8181 Accuracy: 0.8819\n",
      "Epoch: 375/800 Train Loss: 0.6073 Accuracy: 0.9527 Time: 1.93614  | Val Loss: 0.8122 Accuracy: 0.8870\n",
      "Epoch: 376/800 Train Loss: 0.6075 Accuracy: 0.9522 Time: 2.02041  | Val Loss: 0.8227 Accuracy: 0.8824\n",
      "Epoch: 377/800 Train Loss: 0.6083 Accuracy: 0.9529 Time: 2.02467  | Val Loss: 0.8074 Accuracy: 0.8842\n",
      "Epoch: 378/800 Train Loss: 0.6073 Accuracy: 0.9528 Time: 1.87706  | Val Loss: 0.8091 Accuracy: 0.8835\n",
      "Epoch: 379/800 Train Loss: 0.6072 Accuracy: 0.9531 Time: 1.86372  | Val Loss: 0.8193 Accuracy: 0.8789\n",
      "Epoch: 380/800 Train Loss: 0.6063 Accuracy: 0.9527 Time: 1.85414  | Val Loss: 0.8113 Accuracy: 0.8832\n",
      "Epoch: 381/800 Train Loss: 0.6064 Accuracy: 0.9539 Time: 1.87303  | Val Loss: 0.8252 Accuracy: 0.8801\n",
      "Epoch: 382/800 Train Loss: 0.6087 Accuracy: 0.9523 Time: 1.95386  | Val Loss: 0.8134 Accuracy: 0.8840\n",
      "Epoch: 383/800 Train Loss: 0.6069 Accuracy: 0.9517 Time: 1.86352  | Val Loss: 0.8090 Accuracy: 0.8830\n",
      "Epoch: 384/800 Train Loss: 0.6087 Accuracy: 0.9516 Time: 1.86105  | Val Loss: 0.8253 Accuracy: 0.8784\n",
      "Epoch: 385/800 Train Loss: 0.6080 Accuracy: 0.9523 Time: 1.96112  | Val Loss: 0.8235 Accuracy: 0.8792\n",
      "Epoch: 386/800 Train Loss: 0.6061 Accuracy: 0.9537 Time: 1.96942  | Val Loss: 0.8140 Accuracy: 0.8820\n",
      "Epoch: 387/800 Train Loss: 0.6073 Accuracy: 0.9525 Time: 1.87480  | Val Loss: 0.8117 Accuracy: 0.8827\n",
      "Epoch: 388/800 Train Loss: 0.6058 Accuracy: 0.9535 Time: 1.86772  | Val Loss: 0.8094 Accuracy: 0.8831\n",
      "Epoch: 389/800 Train Loss: 0.6055 Accuracy: 0.9530 Time: 1.85399  | Val Loss: 0.8171 Accuracy: 0.8781\n",
      "Epoch: 390/800 Train Loss: 0.6050 Accuracy: 0.9537 Time: 1.88057  | Val Loss: 0.8150 Accuracy: 0.8816\n",
      "Epoch: 391/800 Train Loss: 0.6064 Accuracy: 0.9531 Time: 1.89069  | Val Loss: 0.8126 Accuracy: 0.8797\n",
      "Epoch: 392/800 Train Loss: 0.6081 Accuracy: 0.9520 Time: 1.85895  | Val Loss: 0.8217 Accuracy: 0.8778\n",
      "Epoch: 393/800 Train Loss: 0.6017 Accuracy: 0.9551 Time: 1.98004  | Val Loss: 0.8100 Accuracy: 0.8822\n",
      "Epoch: 394/800 Train Loss: 0.6031 Accuracy: 0.9543 Time: 1.97516  | Val Loss: 0.8072 Accuracy: 0.8832\n",
      "Epoch: 395/800 Train Loss: 0.6060 Accuracy: 0.9529 Time: 1.91467  | Val Loss: 0.8212 Accuracy: 0.8797\n",
      "Epoch: 396/800 Train Loss: 0.6040 Accuracy: 0.9551 Time: 1.85335  | Val Loss: 0.8165 Accuracy: 0.8823\n",
      "Epoch: 397/800 Train Loss: 0.6026 Accuracy: 0.9553 Time: 1.91051  | Val Loss: 0.8104 Accuracy: 0.8836\n",
      "Epoch: 398/800 Train Loss: 0.6052 Accuracy: 0.9531 Time: 1.95324  | Val Loss: 0.8156 Accuracy: 0.8810\n",
      "Epoch: 399/800 Train Loss: 0.6094 Accuracy: 0.9524 Time: 1.84706  | Val Loss: 0.8124 Accuracy: 0.8807\n",
      "Epoch: 400/800 Train Loss: 0.6045 Accuracy: 0.9538 Time: 1.86926  | Val Loss: 0.8094 Accuracy: 0.8821\n",
      "Epoch: 401/800 Train Loss: 0.6023 Accuracy: 0.9546 Time: 1.86822  | Val Loss: 0.8134 Accuracy: 0.8814\n",
      "Epoch: 402/800 Train Loss: 0.6014 Accuracy: 0.9551 Time: 1.90304  | Val Loss: 0.8108 Accuracy: 0.8833\n",
      "Epoch: 403/800 Train Loss: 0.6044 Accuracy: 0.9540 Time: 1.97827  | Val Loss: 0.8159 Accuracy: 0.8796\n",
      "Epoch: 404/800 Train Loss: 0.6049 Accuracy: 0.9541 Time: 1.86513  | Val Loss: 0.8137 Accuracy: 0.8797\n",
      "Epoch: 405/800 Train Loss: 0.6028 Accuracy: 0.9543 Time: 1.87610  | Val Loss: 0.8138 Accuracy: 0.8853\n",
      "Epoch: 406/800 Train Loss: 0.6018 Accuracy: 0.9547 Time: 1.86340  | Val Loss: 0.8140 Accuracy: 0.8824\n",
      "Epoch: 407/800 Train Loss: 0.6002 Accuracy: 0.9551 Time: 1.86117  | Val Loss: 0.8119 Accuracy: 0.8837\n",
      "Epoch: 408/800 Train Loss: 0.6014 Accuracy: 0.9562 Time: 1.88768  | Val Loss: 0.8190 Accuracy: 0.8832\n",
      "Epoch: 409/800 Train Loss: 0.6000 Accuracy: 0.9561 Time: 1.89366  | Val Loss: 0.8160 Accuracy: 0.8811\n",
      "Epoch: 410/800 Train Loss: 0.5996 Accuracy: 0.9556 Time: 1.93894  | Val Loss: 0.8130 Accuracy: 0.8836\n",
      "Epoch: 411/800 Train Loss: 0.6027 Accuracy: 0.9550 Time: 1.88171  | Val Loss: 0.8195 Accuracy: 0.8805\n",
      "Epoch: 412/800 Train Loss: 0.5999 Accuracy: 0.9559 Time: 1.90330  | Val Loss: 0.8261 Accuracy: 0.8802\n",
      "Epoch: 413/800 Train Loss: 0.6013 Accuracy: 0.9554 Time: 1.94841  | Val Loss: 0.8161 Accuracy: 0.8849\n",
      "Epoch: 414/800 Train Loss: 0.5995 Accuracy: 0.9569 Time: 1.91626  | Val Loss: 0.8208 Accuracy: 0.8822\n",
      "Epoch: 415/800 Train Loss: 0.6008 Accuracy: 0.9557 Time: 1.85936  | Val Loss: 0.8064 Accuracy: 0.8839\n",
      "Epoch: 416/800 Train Loss: 0.5991 Accuracy: 0.9557 Time: 1.89889  | Val Loss: 0.8246 Accuracy: 0.8804\n",
      "Epoch: 417/800 Train Loss: 0.6025 Accuracy: 0.9553 Time: 1.91334  | Val Loss: 0.8087 Accuracy: 0.8841\n",
      "Epoch: 418/800 Train Loss: 0.5992 Accuracy: 0.9569 Time: 1.89176  | Val Loss: 0.8261 Accuracy: 0.8786\n",
      "Epoch: 419/800 Train Loss: 0.6008 Accuracy: 0.9557 Time: 1.89649  | Val Loss: 0.8209 Accuracy: 0.8790\n",
      "Epoch: 420/800 Train Loss: 0.6031 Accuracy: 0.9546 Time: 1.89491  | Val Loss: 0.8030 Accuracy: 0.8836\n",
      "Epoch: 421/800 Train Loss: 0.5970 Accuracy: 0.9573 Time: 1.91274  | Val Loss: 0.8158 Accuracy: 0.8841\n",
      "Epoch: 422/800 Train Loss: 0.5972 Accuracy: 0.9575 Time: 1.94330  | Val Loss: 0.8102 Accuracy: 0.8838\n",
      "Epoch: 423/800 Train Loss: 0.5969 Accuracy: 0.9567 Time: 1.87905  | Val Loss: 0.8112 Accuracy: 0.8823\n",
      "Epoch: 424/800 Train Loss: 0.5965 Accuracy: 0.9574 Time: 1.93985  | Val Loss: 0.8100 Accuracy: 0.8824\n",
      "Epoch: 425/800 Train Loss: 0.6001 Accuracy: 0.9561 Time: 1.93261  | Val Loss: 0.8111 Accuracy: 0.8854\n",
      "Epoch: 426/800 Train Loss: 0.5979 Accuracy: 0.9575 Time: 1.88902  | Val Loss: 0.8118 Accuracy: 0.8849\n",
      "Epoch: 427/800 Train Loss: 0.5978 Accuracy: 0.9562 Time: 1.85793  | Val Loss: 0.8151 Accuracy: 0.8816\n",
      "Epoch: 428/800 Train Loss: 0.6004 Accuracy: 0.9560 Time: 1.84347  | Val Loss: 0.8117 Accuracy: 0.8831\n",
      "Epoch: 429/800 Train Loss: 0.5960 Accuracy: 0.9571 Time: 1.96088  | Val Loss: 0.8091 Accuracy: 0.8848\n",
      "Epoch: 430/800 Train Loss: 0.5997 Accuracy: 0.9554 Time: 2.00858  | Val Loss: 0.8106 Accuracy: 0.8839\n",
      "Epoch: 431/800 Train Loss: 0.5953 Accuracy: 0.9582 Time: 1.96623  | Val Loss: 0.8161 Accuracy: 0.8826\n",
      "Epoch: 432/800 Train Loss: 0.5991 Accuracy: 0.9564 Time: 1.93150  | Val Loss: 0.8097 Accuracy: 0.8833\n",
      "Epoch: 433/800 Train Loss: 0.5985 Accuracy: 0.9575 Time: 1.87926  | Val Loss: 0.8135 Accuracy: 0.8831\n",
      "Epoch: 434/800 Train Loss: 0.5956 Accuracy: 0.9572 Time: 1.87213  | Val Loss: 0.8171 Accuracy: 0.8853\n",
      "Epoch: 435/800 Train Loss: 0.5976 Accuracy: 0.9568 Time: 1.85037  | Val Loss: 0.8140 Accuracy: 0.8847\n",
      "Epoch: 436/800 Train Loss: 0.5947 Accuracy: 0.9583 Time: 1.95486  | Val Loss: 0.8205 Accuracy: 0.8820\n",
      "Epoch: 437/800 Train Loss: 0.5971 Accuracy: 0.9575 Time: 1.86148  | Val Loss: 0.8143 Accuracy: 0.8829\n",
      "Epoch: 438/800 Train Loss: 0.5936 Accuracy: 0.9591 Time: 1.85063  | Val Loss: 0.8184 Accuracy: 0.8824\n",
      "Epoch: 439/800 Train Loss: 0.5953 Accuracy: 0.9575 Time: 1.85273  | Val Loss: 0.8072 Accuracy: 0.8851\n",
      "Epoch: 440/800 Train Loss: 0.5972 Accuracy: 0.9571 Time: 1.94571  | Val Loss: 0.8138 Accuracy: 0.8825\n",
      "Epoch: 441/800 Train Loss: 0.5959 Accuracy: 0.9572 Time: 1.87953  | Val Loss: 0.8131 Accuracy: 0.8827\n",
      "Epoch: 442/800 Train Loss: 0.5925 Accuracy: 0.9591 Time: 1.99537  | Val Loss: 0.8079 Accuracy: 0.8851\n",
      "Epoch: 443/800 Train Loss: 0.5943 Accuracy: 0.9589 Time: 1.85900  | Val Loss: 0.8093 Accuracy: 0.8833\n",
      "Epoch: 444/800 Train Loss: 0.5966 Accuracy: 0.9577 Time: 1.89945  | Val Loss: 0.8072 Accuracy: 0.8862\n",
      "Epoch: 445/800 Train Loss: 0.5941 Accuracy: 0.9595 Time: 1.86002  | Val Loss: 0.8101 Accuracy: 0.8849\n",
      "Epoch: 446/800 Train Loss: 0.5934 Accuracy: 0.9591 Time: 2.00248  | Val Loss: 0.8123 Accuracy: 0.8839\n",
      "Epoch: 447/800 Train Loss: 0.5955 Accuracy: 0.9577 Time: 1.90977  | Val Loss: 0.8156 Accuracy: 0.8804\n",
      "Epoch: 448/800 Train Loss: 0.5953 Accuracy: 0.9580 Time: 1.90769  | Val Loss: 0.8030 Accuracy: 0.8856\n",
      "Epoch: 449/800 Train Loss: 0.5979 Accuracy: 0.9576 Time: 1.91572  | Val Loss: 0.8077 Accuracy: 0.8841\n",
      "Epoch: 450/800 Train Loss: 0.5914 Accuracy: 0.9596 Time: 1.84876  | Val Loss: 0.8139 Accuracy: 0.8831\n",
      "Epoch: 451/800 Train Loss: 0.5937 Accuracy: 0.9581 Time: 1.88584  | Val Loss: 0.8145 Accuracy: 0.8827\n",
      "Epoch: 452/800 Train Loss: 0.5912 Accuracy: 0.9596 Time: 1.93406  | Val Loss: 0.8073 Accuracy: 0.8855\n",
      "Epoch: 453/800 Train Loss: 0.5928 Accuracy: 0.9592 Time: 1.94076  | Val Loss: 0.8026 Accuracy: 0.8872\n",
      "Epoch: 454/800 Train Loss: 0.5936 Accuracy: 0.9588 Time: 1.98428  | Val Loss: 0.8115 Accuracy: 0.8834\n",
      "Epoch: 455/800 Train Loss: 0.5922 Accuracy: 0.9593 Time: 1.85010  | Val Loss: 0.8054 Accuracy: 0.8870\n",
      "Epoch: 456/800 Train Loss: 0.5924 Accuracy: 0.9587 Time: 1.92522  | Val Loss: 0.8069 Accuracy: 0.8887\n",
      "Epoch: 457/800 Train Loss: 0.5925 Accuracy: 0.9593 Time: 1.97287  | Val Loss: 0.8105 Accuracy: 0.8854\n",
      "Epoch: 458/800 Train Loss: 0.5913 Accuracy: 0.9602 Time: 1.85921  | Val Loss: 0.8158 Accuracy: 0.8823\n",
      "Epoch: 459/800 Train Loss: 0.5917 Accuracy: 0.9596 Time: 1.91012  | Val Loss: 0.8178 Accuracy: 0.8827\n",
      "Epoch: 460/800 Train Loss: 0.5924 Accuracy: 0.9595 Time: 1.94392  | Val Loss: 0.8119 Accuracy: 0.8823\n",
      "Epoch: 461/800 Train Loss: 0.5916 Accuracy: 0.9595 Time: 1.93688  | Val Loss: 0.8065 Accuracy: 0.8865\n",
      "Epoch: 462/800 Train Loss: 0.5947 Accuracy: 0.9579 Time: 1.90251  | Val Loss: 0.8186 Accuracy: 0.8828\n",
      "Epoch: 463/800 Train Loss: 0.5915 Accuracy: 0.9598 Time: 1.91422  | Val Loss: 0.8062 Accuracy: 0.8866\n",
      "Epoch: 464/800 Train Loss: 0.5891 Accuracy: 0.9607 Time: 1.89598  | Val Loss: 0.8102 Accuracy: 0.8839\n",
      "Epoch: 465/800 Train Loss: 0.5904 Accuracy: 0.9599 Time: 1.96797  | Val Loss: 0.8146 Accuracy: 0.8868\n",
      "Epoch: 466/800 Train Loss: 0.5908 Accuracy: 0.9601 Time: 1.86649  | Val Loss: 0.8130 Accuracy: 0.8855\n",
      "Epoch: 467/800 Train Loss: 0.5920 Accuracy: 0.9592 Time: 1.90805  | Val Loss: 0.8117 Accuracy: 0.8828\n",
      "Epoch: 468/800 Train Loss: 0.5884 Accuracy: 0.9612 Time: 1.85101  | Val Loss: 0.8022 Accuracy: 0.8866\n",
      "Epoch: 469/800 Train Loss: 0.5878 Accuracy: 0.9617 Time: 1.89181  | Val Loss: 0.8150 Accuracy: 0.8839\n",
      "Epoch: 470/800 Train Loss: 0.5920 Accuracy: 0.9589 Time: 1.87041  | Val Loss: 0.8118 Accuracy: 0.8845\n",
      "Epoch: 471/800 Train Loss: 0.5901 Accuracy: 0.9603 Time: 1.87670  | Val Loss: 0.8141 Accuracy: 0.8850\n",
      "Epoch: 472/800 Train Loss: 0.5897 Accuracy: 0.9599 Time: 1.85112  | Val Loss: 0.8129 Accuracy: 0.8840\n",
      "Epoch: 473/800 Train Loss: 0.5915 Accuracy: 0.9591 Time: 1.88481  | Val Loss: 0.8150 Accuracy: 0.8818\n",
      "Epoch: 474/800 Train Loss: 0.5892 Accuracy: 0.9595 Time: 1.87172  | Val Loss: 0.8128 Accuracy: 0.8847\n",
      "Epoch: 475/800 Train Loss: 0.5915 Accuracy: 0.9595 Time: 1.91418  | Val Loss: 0.8131 Accuracy: 0.8866\n",
      "Epoch: 476/800 Train Loss: 0.5891 Accuracy: 0.9609 Time: 1.87549  | Val Loss: 0.8196 Accuracy: 0.8852\n",
      "Epoch: 477/800 Train Loss: 0.5875 Accuracy: 0.9617 Time: 1.90858  | Val Loss: 0.8048 Accuracy: 0.8867\n",
      "Epoch: 478/800 Train Loss: 0.5895 Accuracy: 0.9605 Time: 1.96608  | Val Loss: 0.8135 Accuracy: 0.8823\n",
      "Epoch: 479/800 Train Loss: 0.5878 Accuracy: 0.9611 Time: 2.00720  | Val Loss: 0.8158 Accuracy: 0.8832\n",
      "Epoch: 480/800 Train Loss: 0.5882 Accuracy: 0.9613 Time: 1.87788  | Val Loss: 0.8100 Accuracy: 0.8833\n",
      "Epoch: 481/800 Train Loss: 0.5883 Accuracy: 0.9610 Time: 1.88687  | Val Loss: 0.8180 Accuracy: 0.8838\n",
      "Epoch: 482/800 Train Loss: 0.5890 Accuracy: 0.9615 Time: 1.96549  | Val Loss: 0.8214 Accuracy: 0.8815\n",
      "Epoch: 483/800 Train Loss: 0.5894 Accuracy: 0.9599 Time: 1.89180  | Val Loss: 0.8180 Accuracy: 0.8831\n",
      "Epoch: 484/800 Train Loss: 0.5905 Accuracy: 0.9602 Time: 1.88994  | Val Loss: 0.8066 Accuracy: 0.8866\n",
      "Epoch: 485/800 Train Loss: 0.5896 Accuracy: 0.9607 Time: 1.84508  | Val Loss: 0.8121 Accuracy: 0.8820\n",
      "Epoch: 486/800 Train Loss: 0.5879 Accuracy: 0.9622 Time: 1.85014  | Val Loss: 0.8112 Accuracy: 0.8839\n",
      "Epoch: 487/800 Train Loss: 0.5865 Accuracy: 0.9612 Time: 1.89874  | Val Loss: 0.8137 Accuracy: 0.8829\n",
      "Epoch: 488/800 Train Loss: 0.5871 Accuracy: 0.9612 Time: 1.87640  | Val Loss: 0.8102 Accuracy: 0.8813\n",
      "Epoch: 489/800 Train Loss: 0.5868 Accuracy: 0.9626 Time: 1.92351  | Val Loss: 0.8044 Accuracy: 0.8841\n",
      "Epoch: 490/800 Train Loss: 0.5877 Accuracy: 0.9608 Time: 1.92276  | Val Loss: 0.8164 Accuracy: 0.8847\n",
      "Epoch: 491/800 Train Loss: 0.5873 Accuracy: 0.9629 Time: 1.88601  | Val Loss: 0.8089 Accuracy: 0.8855\n",
      "Epoch: 492/800 Train Loss: 0.5869 Accuracy: 0.9616 Time: 1.87278  | Val Loss: 0.8159 Accuracy: 0.8856\n",
      "Epoch: 493/800 Train Loss: 0.5874 Accuracy: 0.9613 Time: 1.86597  | Val Loss: 0.8019 Accuracy: 0.8850\n",
      "Epoch: 494/800 Train Loss: 0.5859 Accuracy: 0.9625 Time: 1.91183  | Val Loss: 0.8070 Accuracy: 0.8840\n",
      "Epoch: 495/800 Train Loss: 0.5895 Accuracy: 0.9600 Time: 1.90116  | Val Loss: 0.8105 Accuracy: 0.8844\n",
      "Epoch: 496/800 Train Loss: 0.5898 Accuracy: 0.9600 Time: 1.87266  | Val Loss: 0.7976 Accuracy: 0.8869\n",
      "Epoch: 497/800 Train Loss: 0.5854 Accuracy: 0.9633 Time: 1.93342  | Val Loss: 0.8068 Accuracy: 0.8855\n",
      "Epoch: 498/800 Train Loss: 0.5857 Accuracy: 0.9627 Time: 1.90057  | Val Loss: 0.7980 Accuracy: 0.8875\n",
      "Epoch: 499/800 Train Loss: 0.5882 Accuracy: 0.9609 Time: 1.92156  | Val Loss: 0.8077 Accuracy: 0.8859\n",
      "Epoch: 500/800 Train Loss: 0.5872 Accuracy: 0.9613 Time: 1.86023  | Val Loss: 0.8091 Accuracy: 0.8863\n",
      "Epoch: 501/800 Train Loss: 0.5854 Accuracy: 0.9621 Time: 1.89790  | Val Loss: 0.8113 Accuracy: 0.8844\n",
      "Epoch: 502/800 Train Loss: 0.5836 Accuracy: 0.9632 Time: 1.89587  | Val Loss: 0.8101 Accuracy: 0.8850\n",
      "Epoch: 503/800 Train Loss: 0.5846 Accuracy: 0.9631 Time: 1.96718  | Val Loss: 0.8086 Accuracy: 0.8860\n",
      "Epoch: 504/800 Train Loss: 0.5840 Accuracy: 0.9632 Time: 1.87046  | Val Loss: 0.8052 Accuracy: 0.8872\n",
      "Epoch: 505/800 Train Loss: 0.5818 Accuracy: 0.9641 Time: 1.84522  | Val Loss: 0.8127 Accuracy: 0.8856\n",
      "Epoch: 506/800 Train Loss: 0.5866 Accuracy: 0.9621 Time: 1.86192  | Val Loss: 0.8095 Accuracy: 0.8827\n",
      "Epoch: 507/800 Train Loss: 0.5842 Accuracy: 0.9627 Time: 1.93636  | Val Loss: 0.8080 Accuracy: 0.8868\n",
      "Epoch: 508/800 Train Loss: 0.5845 Accuracy: 0.9626 Time: 1.87494  | Val Loss: 0.8056 Accuracy: 0.8858\n",
      "Epoch: 509/800 Train Loss: 0.5839 Accuracy: 0.9637 Time: 1.91572  | Val Loss: 0.8014 Accuracy: 0.8877\n",
      "Epoch: 510/800 Train Loss: 0.5821 Accuracy: 0.9637 Time: 1.91984  | Val Loss: 0.8041 Accuracy: 0.8876\n",
      "Epoch: 511/800 Train Loss: 0.5862 Accuracy: 0.9623 Time: 1.86715  | Val Loss: 0.8103 Accuracy: 0.8833\n",
      "Epoch: 512/800 Train Loss: 0.5847 Accuracy: 0.9622 Time: 1.89939  | Val Loss: 0.8104 Accuracy: 0.8865\n",
      "Epoch: 513/800 Train Loss: 0.5841 Accuracy: 0.9634 Time: 1.87710  | Val Loss: 0.8068 Accuracy: 0.8831\n",
      "Epoch: 514/800 Train Loss: 0.5839 Accuracy: 0.9626 Time: 1.84273  | Val Loss: 0.8094 Accuracy: 0.8830\n",
      "Epoch: 515/800 Train Loss: 0.5829 Accuracy: 0.9639 Time: 1.95920  | Val Loss: 0.8074 Accuracy: 0.8856\n",
      "Epoch: 516/800 Train Loss: 0.5830 Accuracy: 0.9638 Time: 1.88818  | Val Loss: 0.8077 Accuracy: 0.8840\n",
      "Epoch: 517/800 Train Loss: 0.5820 Accuracy: 0.9636 Time: 1.84109  | Val Loss: 0.8051 Accuracy: 0.8852\n",
      "Epoch: 518/800 Train Loss: 0.5842 Accuracy: 0.9627 Time: 1.84505  | Val Loss: 0.8079 Accuracy: 0.8841\n",
      "Epoch: 519/800 Train Loss: 0.5824 Accuracy: 0.9634 Time: 1.90322  | Val Loss: 0.8065 Accuracy: 0.8853\n",
      "Epoch: 520/800 Train Loss: 0.5842 Accuracy: 0.9625 Time: 2.02189  | Val Loss: 0.8062 Accuracy: 0.8853\n",
      "Epoch: 521/800 Train Loss: 0.5816 Accuracy: 0.9645 Time: 1.89793  | Val Loss: 0.8025 Accuracy: 0.8869\n",
      "Epoch: 522/800 Train Loss: 0.5830 Accuracy: 0.9634 Time: 1.87144  | Val Loss: 0.8014 Accuracy: 0.8861\n",
      "Epoch: 523/800 Train Loss: 0.5857 Accuracy: 0.9625 Time: 1.93625  | Val Loss: 0.8034 Accuracy: 0.8857\n",
      "Epoch: 524/800 Train Loss: 0.5814 Accuracy: 0.9646 Time: 1.94583  | Val Loss: 0.8036 Accuracy: 0.8868\n",
      "Epoch: 525/800 Train Loss: 0.5842 Accuracy: 0.9632 Time: 1.88151  | Val Loss: 0.8064 Accuracy: 0.8852\n",
      "Epoch: 526/800 Train Loss: 0.5834 Accuracy: 0.9633 Time: 1.87998  | Val Loss: 0.8014 Accuracy: 0.8881\n",
      "Epoch: 527/800 Train Loss: 0.5834 Accuracy: 0.9630 Time: 1.89908  | Val Loss: 0.8080 Accuracy: 0.8856\n",
      "Epoch: 528/800 Train Loss: 0.5830 Accuracy: 0.9629 Time: 1.85473  | Val Loss: 0.8055 Accuracy: 0.8853\n",
      "Epoch: 529/800 Train Loss: 0.5796 Accuracy: 0.9656 Time: 1.92084  | Val Loss: 0.8020 Accuracy: 0.8878\n",
      "Epoch: 530/800 Train Loss: 0.5819 Accuracy: 0.9633 Time: 1.87774  | Val Loss: 0.8021 Accuracy: 0.8894\n",
      "Epoch: 531/800 Train Loss: 0.5788 Accuracy: 0.9655 Time: 1.86967  | Val Loss: 0.8059 Accuracy: 0.8857\n",
      "Epoch: 532/800 Train Loss: 0.5822 Accuracy: 0.9639 Time: 1.84225  | Val Loss: 0.8015 Accuracy: 0.8862\n",
      "Epoch: 533/800 Train Loss: 0.5797 Accuracy: 0.9638 Time: 1.99218  | Val Loss: 0.8126 Accuracy: 0.8872\n",
      "Epoch: 534/800 Train Loss: 0.5813 Accuracy: 0.9643 Time: 1.89817  | Val Loss: 0.8099 Accuracy: 0.8857\n",
      "Epoch: 535/800 Train Loss: 0.5832 Accuracy: 0.9637 Time: 1.91577  | Val Loss: 0.8015 Accuracy: 0.8889\n",
      "Epoch: 536/800 Train Loss: 0.5797 Accuracy: 0.9647 Time: 1.88700  | Val Loss: 0.8124 Accuracy: 0.8860\n",
      "Epoch: 537/800 Train Loss: 0.5803 Accuracy: 0.9644 Time: 2.01081  | Val Loss: 0.8027 Accuracy: 0.8865\n",
      "Epoch: 538/800 Train Loss: 0.5831 Accuracy: 0.9637 Time: 1.85807  | Val Loss: 0.8013 Accuracy: 0.8871\n",
      "Epoch: 539/800 Train Loss: 0.5812 Accuracy: 0.9643 Time: 1.91593  | Val Loss: 0.8099 Accuracy: 0.8864\n",
      "Epoch: 540/800 Train Loss: 0.5804 Accuracy: 0.9646 Time: 1.85563  | Val Loss: 0.8025 Accuracy: 0.8891\n",
      "Epoch: 541/800 Train Loss: 0.5823 Accuracy: 0.9639 Time: 1.89368  | Val Loss: 0.8101 Accuracy: 0.8856\n",
      "Epoch: 542/800 Train Loss: 0.5815 Accuracy: 0.9640 Time: 1.87956  | Val Loss: 0.8038 Accuracy: 0.8878\n",
      "Epoch: 543/800 Train Loss: 0.5788 Accuracy: 0.9652 Time: 1.88622  | Val Loss: 0.8018 Accuracy: 0.8886\n",
      "Epoch: 544/800 Train Loss: 0.5769 Accuracy: 0.9664 Time: 1.90574  | Val Loss: 0.7992 Accuracy: 0.8871\n",
      "Epoch: 545/800 Train Loss: 0.5783 Accuracy: 0.9656 Time: 1.85581  | Val Loss: 0.7987 Accuracy: 0.8867\n",
      "Epoch: 546/800 Train Loss: 0.5823 Accuracy: 0.9633 Time: 1.89716  | Val Loss: 0.8004 Accuracy: 0.8890\n",
      "Epoch: 547/800 Train Loss: 0.5820 Accuracy: 0.9640 Time: 1.92918  | Val Loss: 0.7984 Accuracy: 0.8898\n",
      "Epoch: 548/800 Train Loss: 0.5801 Accuracy: 0.9647 Time: 1.85997  | Val Loss: 0.8051 Accuracy: 0.8870\n",
      "Epoch: 549/800 Train Loss: 0.5792 Accuracy: 0.9649 Time: 1.87524  | Val Loss: 0.8012 Accuracy: 0.8871\n",
      "Epoch: 550/800 Train Loss: 0.5809 Accuracy: 0.9642 Time: 1.92231  | Val Loss: 0.8010 Accuracy: 0.8874\n",
      "Epoch: 551/800 Train Loss: 0.5787 Accuracy: 0.9646 Time: 1.91587  | Val Loss: 0.8091 Accuracy: 0.8851\n",
      "Epoch: 552/800 Train Loss: 0.5788 Accuracy: 0.9651 Time: 1.85263  | Val Loss: 0.8021 Accuracy: 0.8887\n",
      "Epoch: 553/800 Train Loss: 0.5826 Accuracy: 0.9632 Time: 1.88324  | Val Loss: 0.8019 Accuracy: 0.8890\n",
      "Epoch: 554/800 Train Loss: 0.5800 Accuracy: 0.9643 Time: 1.86348  | Val Loss: 0.7997 Accuracy: 0.8900\n",
      "Epoch: 555/800 Train Loss: 0.5780 Accuracy: 0.9654 Time: 1.94037  | Val Loss: 0.7996 Accuracy: 0.8899\n",
      "Epoch: 556/800 Train Loss: 0.5793 Accuracy: 0.9648 Time: 1.95248  | Val Loss: 0.8036 Accuracy: 0.8878\n",
      "Epoch: 557/800 Train Loss: 0.5807 Accuracy: 0.9644 Time: 1.86605  | Val Loss: 0.7950 Accuracy: 0.8897\n",
      "Epoch: 558/800 Train Loss: 0.5775 Accuracy: 0.9654 Time: 1.87964  | Val Loss: 0.7989 Accuracy: 0.8882\n",
      "Epoch: 559/800 Train Loss: 0.5749 Accuracy: 0.9666 Time: 1.92095  | Val Loss: 0.8070 Accuracy: 0.8871\n",
      "Epoch: 560/800 Train Loss: 0.5753 Accuracy: 0.9669 Time: 1.88576  | Val Loss: 0.8022 Accuracy: 0.8881\n",
      "Epoch: 561/800 Train Loss: 0.5782 Accuracy: 0.9653 Time: 1.90572  | Val Loss: 0.8044 Accuracy: 0.8853\n",
      "Epoch: 562/800 Train Loss: 0.5757 Accuracy: 0.9664 Time: 1.97871  | Val Loss: 0.8030 Accuracy: 0.8899\n",
      "Epoch: 563/800 Train Loss: 0.5761 Accuracy: 0.9661 Time: 1.88500  | Val Loss: 0.8072 Accuracy: 0.8875\n",
      "Epoch: 564/800 Train Loss: 0.5780 Accuracy: 0.9658 Time: 1.89080  | Val Loss: 0.8058 Accuracy: 0.8882\n",
      "Epoch: 565/800 Train Loss: 0.5801 Accuracy: 0.9640 Time: 1.89382  | Val Loss: 0.7955 Accuracy: 0.8886\n",
      "Epoch: 566/800 Train Loss: 0.5763 Accuracy: 0.9662 Time: 1.89912  | Val Loss: 0.8024 Accuracy: 0.8885\n",
      "Epoch: 567/800 Train Loss: 0.5791 Accuracy: 0.9648 Time: 1.94044  | Val Loss: 0.7955 Accuracy: 0.8907\n",
      "Epoch: 568/800 Train Loss: 0.5781 Accuracy: 0.9651 Time: 1.92358  | Val Loss: 0.7947 Accuracy: 0.8896\n",
      "Epoch: 569/800 Train Loss: 0.5780 Accuracy: 0.9647 Time: 1.87009  | Val Loss: 0.7972 Accuracy: 0.8903\n",
      "Epoch: 570/800 Train Loss: 0.5773 Accuracy: 0.9657 Time: 1.88821  | Val Loss: 0.7991 Accuracy: 0.8858\n",
      "Epoch: 571/800 Train Loss: 0.5766 Accuracy: 0.9656 Time: 1.84740  | Val Loss: 0.7987 Accuracy: 0.8880\n",
      "Epoch: 572/800 Train Loss: 0.5762 Accuracy: 0.9667 Time: 1.98713  | Val Loss: 0.7995 Accuracy: 0.8892\n",
      "Epoch: 573/800 Train Loss: 0.5769 Accuracy: 0.9659 Time: 1.85391  | Val Loss: 0.7962 Accuracy: 0.8916\n",
      "Epoch: 574/800 Train Loss: 0.5781 Accuracy: 0.9657 Time: 1.87356  | Val Loss: 0.7998 Accuracy: 0.8891\n",
      "Epoch: 575/800 Train Loss: 0.5763 Accuracy: 0.9652 Time: 1.84783  | Val Loss: 0.7998 Accuracy: 0.8893\n",
      "Epoch: 576/800 Train Loss: 0.5766 Accuracy: 0.9660 Time: 1.88630  | Val Loss: 0.8001 Accuracy: 0.8903\n",
      "Epoch: 577/800 Train Loss: 0.5759 Accuracy: 0.9666 Time: 1.92455  | Val Loss: 0.7992 Accuracy: 0.8899\n",
      "Epoch: 578/800 Train Loss: 0.5745 Accuracy: 0.9675 Time: 1.98832  | Val Loss: 0.7984 Accuracy: 0.8916\n",
      "Epoch: 579/800 Train Loss: 0.5766 Accuracy: 0.9667 Time: 1.89858  | Val Loss: 0.8005 Accuracy: 0.8890\n",
      "Epoch: 580/800 Train Loss: 0.5744 Accuracy: 0.9669 Time: 1.90609  | Val Loss: 0.8010 Accuracy: 0.8900\n",
      "Epoch: 581/800 Train Loss: 0.5747 Accuracy: 0.9677 Time: 1.90413  | Val Loss: 0.8061 Accuracy: 0.8883\n",
      "Epoch: 582/800 Train Loss: 0.5722 Accuracy: 0.9685 Time: 1.92009  | Val Loss: 0.8041 Accuracy: 0.8897\n",
      "Epoch: 583/800 Train Loss: 0.5756 Accuracy: 0.9658 Time: 1.90881  | Val Loss: 0.7949 Accuracy: 0.8926\n",
      "Epoch: 584/800 Train Loss: 0.5740 Accuracy: 0.9674 Time: 1.98312  | Val Loss: 0.7961 Accuracy: 0.8911\n",
      "Epoch: 585/800 Train Loss: 0.5751 Accuracy: 0.9669 Time: 1.90420  | Val Loss: 0.7955 Accuracy: 0.8894\n",
      "Epoch: 586/800 Train Loss: 0.5727 Accuracy: 0.9681 Time: 1.90489  | Val Loss: 0.7998 Accuracy: 0.8901\n",
      "Epoch: 587/800 Train Loss: 0.5777 Accuracy: 0.9657 Time: 1.97827  | Val Loss: 0.8029 Accuracy: 0.8900\n",
      "Epoch: 588/800 Train Loss: 0.5759 Accuracy: 0.9665 Time: 2.03248  | Val Loss: 0.7993 Accuracy: 0.8899\n",
      "Epoch: 589/800 Train Loss: 0.5733 Accuracy: 0.9677 Time: 1.87563  | Val Loss: 0.8003 Accuracy: 0.8894\n",
      "Epoch: 590/800 Train Loss: 0.5763 Accuracy: 0.9655 Time: 1.96678  | Val Loss: 0.8071 Accuracy: 0.8885\n",
      "Epoch: 591/800 Train Loss: 0.5741 Accuracy: 0.9674 Time: 1.84539  | Val Loss: 0.7940 Accuracy: 0.8925\n",
      "Epoch: 592/800 Train Loss: 0.5716 Accuracy: 0.9681 Time: 1.88893  | Val Loss: 0.7970 Accuracy: 0.8912\n",
      "Epoch: 593/800 Train Loss: 0.5739 Accuracy: 0.9671 Time: 1.87234  | Val Loss: 0.8002 Accuracy: 0.8912\n",
      "Epoch: 594/800 Train Loss: 0.5750 Accuracy: 0.9668 Time: 1.89034  | Val Loss: 0.7981 Accuracy: 0.8895\n",
      "Epoch: 595/800 Train Loss: 0.5733 Accuracy: 0.9676 Time: 1.92806  | Val Loss: 0.8007 Accuracy: 0.8888\n",
      "Epoch: 596/800 Train Loss: 0.5729 Accuracy: 0.9675 Time: 1.84439  | Val Loss: 0.8007 Accuracy: 0.8898\n",
      "Epoch: 597/800 Train Loss: 0.5742 Accuracy: 0.9668 Time: 1.92927  | Val Loss: 0.8042 Accuracy: 0.8870\n",
      "Epoch: 598/800 Train Loss: 0.5747 Accuracy: 0.9670 Time: 1.89822  | Val Loss: 0.7982 Accuracy: 0.8878\n",
      "Epoch: 599/800 Train Loss: 0.5712 Accuracy: 0.9687 Time: 1.91248  | Val Loss: 0.7945 Accuracy: 0.8902\n",
      "Epoch: 600/800 Train Loss: 0.5716 Accuracy: 0.9685 Time: 1.91326  | Val Loss: 0.8002 Accuracy: 0.8878\n",
      "Epoch: 601/800 Train Loss: 0.5717 Accuracy: 0.9682 Time: 1.88262  | Val Loss: 0.7987 Accuracy: 0.8890\n",
      "Epoch: 602/800 Train Loss: 0.5740 Accuracy: 0.9676 Time: 1.85461  | Val Loss: 0.8010 Accuracy: 0.8861\n",
      "Epoch: 603/800 Train Loss: 0.5730 Accuracy: 0.9673 Time: 2.00071  | Val Loss: 0.7984 Accuracy: 0.8890\n",
      "Epoch: 604/800 Train Loss: 0.5728 Accuracy: 0.9678 Time: 1.96431  | Val Loss: 0.7978 Accuracy: 0.8885\n",
      "Epoch: 605/800 Train Loss: 0.5743 Accuracy: 0.9671 Time: 1.91668  | Val Loss: 0.8010 Accuracy: 0.8884\n",
      "Epoch: 606/800 Train Loss: 0.5714 Accuracy: 0.9680 Time: 1.92201  | Val Loss: 0.7993 Accuracy: 0.8907\n",
      "Epoch: 607/800 Train Loss: 0.5755 Accuracy: 0.9670 Time: 1.99401  | Val Loss: 0.7965 Accuracy: 0.8908\n",
      "Epoch: 608/800 Train Loss: 0.5715 Accuracy: 0.9678 Time: 1.90212  | Val Loss: 0.7985 Accuracy: 0.8906\n",
      "Epoch: 609/800 Train Loss: 0.5710 Accuracy: 0.9689 Time: 1.95817  | Val Loss: 0.8027 Accuracy: 0.8879\n",
      "Epoch: 610/800 Train Loss: 0.5725 Accuracy: 0.9685 Time: 1.84707  | Val Loss: 0.8008 Accuracy: 0.8892\n",
      "Epoch: 611/800 Train Loss: 0.5721 Accuracy: 0.9682 Time: 1.93684  | Val Loss: 0.7980 Accuracy: 0.8894\n",
      "Epoch: 612/800 Train Loss: 0.5706 Accuracy: 0.9691 Time: 1.99572  | Val Loss: 0.7996 Accuracy: 0.8901\n",
      "Epoch: 613/800 Train Loss: 0.5717 Accuracy: 0.9680 Time: 2.00828  | Val Loss: 0.7984 Accuracy: 0.8905\n",
      "Epoch: 614/800 Train Loss: 0.5722 Accuracy: 0.9685 Time: 1.96398  | Val Loss: 0.7991 Accuracy: 0.8896\n",
      "Epoch: 615/800 Train Loss: 0.5712 Accuracy: 0.9687 Time: 1.92252  | Val Loss: 0.8042 Accuracy: 0.8889\n",
      "Epoch: 616/800 Train Loss: 0.5719 Accuracy: 0.9686 Time: 1.86619  | Val Loss: 0.7995 Accuracy: 0.8895\n",
      "Epoch: 617/800 Train Loss: 0.5745 Accuracy: 0.9671 Time: 1.85813  | Val Loss: 0.7965 Accuracy: 0.8912\n",
      "Epoch: 618/800 Train Loss: 0.5723 Accuracy: 0.9683 Time: 1.89260  | Val Loss: 0.7988 Accuracy: 0.8879\n",
      "Epoch: 619/800 Train Loss: 0.5733 Accuracy: 0.9677 Time: 1.95439  | Val Loss: 0.7963 Accuracy: 0.8891\n",
      "Epoch: 620/800 Train Loss: 0.5739 Accuracy: 0.9678 Time: 1.95025  | Val Loss: 0.7983 Accuracy: 0.8890\n",
      "Epoch: 621/800 Train Loss: 0.5698 Accuracy: 0.9693 Time: 1.95175  | Val Loss: 0.7962 Accuracy: 0.8895\n",
      "Epoch: 622/800 Train Loss: 0.5711 Accuracy: 0.9686 Time: 2.00167  | Val Loss: 0.7955 Accuracy: 0.8913\n",
      "Epoch: 623/800 Train Loss: 0.5705 Accuracy: 0.9690 Time: 2.09479  | Val Loss: 0.7974 Accuracy: 0.8887\n",
      "Epoch: 624/800 Train Loss: 0.5718 Accuracy: 0.9688 Time: 1.88977  | Val Loss: 0.8018 Accuracy: 0.8868\n",
      "Epoch: 625/800 Train Loss: 0.5706 Accuracy: 0.9685 Time: 1.86790  | Val Loss: 0.8014 Accuracy: 0.8876\n",
      "Epoch: 626/800 Train Loss: 0.5701 Accuracy: 0.9693 Time: 1.88338  | Val Loss: 0.7972 Accuracy: 0.8881\n",
      "Epoch: 627/800 Train Loss: 0.5727 Accuracy: 0.9679 Time: 1.88025  | Val Loss: 0.8027 Accuracy: 0.8864\n",
      "Epoch: 628/800 Train Loss: 0.5703 Accuracy: 0.9691 Time: 1.92019  | Val Loss: 0.8032 Accuracy: 0.8871\n",
      "Epoch: 629/800 Train Loss: 0.5686 Accuracy: 0.9702 Time: 1.98708  | Val Loss: 0.8023 Accuracy: 0.8882\n",
      "Epoch: 630/800 Train Loss: 0.5696 Accuracy: 0.9695 Time: 1.90800  | Val Loss: 0.8043 Accuracy: 0.8861\n",
      "Epoch: 631/800 Train Loss: 0.5679 Accuracy: 0.9698 Time: 1.87048  | Val Loss: 0.8060 Accuracy: 0.8868\n",
      "Epoch: 632/800 Train Loss: 0.5728 Accuracy: 0.9678 Time: 1.93205  | Val Loss: 0.8025 Accuracy: 0.8887\n",
      "Epoch: 633/800 Train Loss: 0.5708 Accuracy: 0.9688 Time: 1.91092  | Val Loss: 0.8002 Accuracy: 0.8881\n",
      "Epoch: 634/800 Train Loss: 0.5728 Accuracy: 0.9672 Time: 1.90360  | Val Loss: 0.8015 Accuracy: 0.8880\n",
      "Epoch: 635/800 Train Loss: 0.5694 Accuracy: 0.9695 Time: 1.87460  | Val Loss: 0.8033 Accuracy: 0.8881\n",
      "Epoch: 636/800 Train Loss: 0.5712 Accuracy: 0.9688 Time: 1.88034  | Val Loss: 0.7977 Accuracy: 0.8890\n",
      "Epoch: 637/800 Train Loss: 0.5703 Accuracy: 0.9689 Time: 1.99730  | Val Loss: 0.7974 Accuracy: 0.8896\n",
      "Epoch: 638/800 Train Loss: 0.5713 Accuracy: 0.9682 Time: 1.89745  | Val Loss: 0.8003 Accuracy: 0.8880\n",
      "Epoch: 639/800 Train Loss: 0.5704 Accuracy: 0.9694 Time: 1.90806  | Val Loss: 0.8020 Accuracy: 0.8897\n",
      "Epoch: 640/800 Train Loss: 0.5683 Accuracy: 0.9697 Time: 1.86483  | Val Loss: 0.7988 Accuracy: 0.8883\n",
      "Epoch: 641/800 Train Loss: 0.5693 Accuracy: 0.9694 Time: 1.99570  | Val Loss: 0.7984 Accuracy: 0.8903\n",
      "Epoch: 642/800 Train Loss: 0.5713 Accuracy: 0.9678 Time: 1.99495  | Val Loss: 0.7983 Accuracy: 0.8883\n",
      "Epoch: 643/800 Train Loss: 0.5703 Accuracy: 0.9689 Time: 1.88488  | Val Loss: 0.8002 Accuracy: 0.8897\n",
      "Epoch: 644/800 Train Loss: 0.5713 Accuracy: 0.9682 Time: 1.88870  | Val Loss: 0.7974 Accuracy: 0.8880\n",
      "Epoch: 645/800 Train Loss: 0.5704 Accuracy: 0.9692 Time: 1.87602  | Val Loss: 0.7959 Accuracy: 0.8892\n",
      "Epoch: 646/800 Train Loss: 0.5704 Accuracy: 0.9687 Time: 1.85713  | Val Loss: 0.7985 Accuracy: 0.8897\n",
      "Epoch: 647/800 Train Loss: 0.5675 Accuracy: 0.9707 Time: 1.84882  | Val Loss: 0.7970 Accuracy: 0.8890\n",
      "Epoch: 648/800 Train Loss: 0.5689 Accuracy: 0.9690 Time: 1.88836  | Val Loss: 0.7975 Accuracy: 0.8897\n",
      "Epoch: 649/800 Train Loss: 0.5695 Accuracy: 0.9695 Time: 1.93200  | Val Loss: 0.7988 Accuracy: 0.8893\n",
      "Epoch: 650/800 Train Loss: 0.5722 Accuracy: 0.9679 Time: 1.87579  | Val Loss: 0.7970 Accuracy: 0.8895\n",
      "Epoch: 651/800 Train Loss: 0.5714 Accuracy: 0.9683 Time: 1.92191  | Val Loss: 0.7935 Accuracy: 0.8884\n",
      "Epoch: 652/800 Train Loss: 0.5690 Accuracy: 0.9696 Time: 1.88330  | Val Loss: 0.7978 Accuracy: 0.8888\n",
      "Epoch: 653/800 Train Loss: 0.5679 Accuracy: 0.9696 Time: 1.93849  | Val Loss: 0.8008 Accuracy: 0.8898\n",
      "Epoch: 654/800 Train Loss: 0.5699 Accuracy: 0.9694 Time: 1.89231  | Val Loss: 0.8002 Accuracy: 0.8881\n",
      "Epoch: 655/800 Train Loss: 0.5683 Accuracy: 0.9700 Time: 1.91219  | Val Loss: 0.7966 Accuracy: 0.8909\n",
      "Epoch: 656/800 Train Loss: 0.5685 Accuracy: 0.9695 Time: 1.86459  | Val Loss: 0.7999 Accuracy: 0.8904\n",
      "Epoch: 657/800 Train Loss: 0.5679 Accuracy: 0.9702 Time: 1.91433  | Val Loss: 0.7974 Accuracy: 0.8886\n",
      "Epoch: 658/800 Train Loss: 0.5708 Accuracy: 0.9692 Time: 1.92234  | Val Loss: 0.7972 Accuracy: 0.8896\n",
      "Epoch: 659/800 Train Loss: 0.5695 Accuracy: 0.9691 Time: 1.90091  | Val Loss: 0.7954 Accuracy: 0.8905\n",
      "Epoch: 660/800 Train Loss: 0.5679 Accuracy: 0.9696 Time: 1.84704  | Val Loss: 0.7944 Accuracy: 0.8905\n",
      "Epoch: 661/800 Train Loss: 0.5683 Accuracy: 0.9696 Time: 1.95310  | Val Loss: 0.7954 Accuracy: 0.8918\n",
      "Epoch: 662/800 Train Loss: 0.5672 Accuracy: 0.9700 Time: 1.84817  | Val Loss: 0.7962 Accuracy: 0.8900\n",
      "Epoch: 663/800 Train Loss: 0.5678 Accuracy: 0.9700 Time: 1.92464  | Val Loss: 0.7979 Accuracy: 0.8904\n",
      "Epoch: 664/800 Train Loss: 0.5682 Accuracy: 0.9699 Time: 1.97664  | Val Loss: 0.7976 Accuracy: 0.8896\n",
      "Epoch: 665/800 Train Loss: 0.5686 Accuracy: 0.9697 Time: 1.88092  | Val Loss: 0.7991 Accuracy: 0.8883\n",
      "Epoch: 666/800 Train Loss: 0.5662 Accuracy: 0.9710 Time: 1.89767  | Val Loss: 0.8006 Accuracy: 0.8896\n",
      "Epoch: 667/800 Train Loss: 0.5673 Accuracy: 0.9703 Time: 1.91581  | Val Loss: 0.7956 Accuracy: 0.8918\n",
      "Epoch: 668/800 Train Loss: 0.5712 Accuracy: 0.9685 Time: 1.89446  | Val Loss: 0.7985 Accuracy: 0.8909\n",
      "Epoch: 669/800 Train Loss: 0.5670 Accuracy: 0.9703 Time: 1.94206  | Val Loss: 0.8001 Accuracy: 0.8886\n",
      "Epoch: 670/800 Train Loss: 0.5671 Accuracy: 0.9705 Time: 1.90216  | Val Loss: 0.8043 Accuracy: 0.8878\n",
      "Epoch: 671/800 Train Loss: 0.5666 Accuracy: 0.9708 Time: 1.85430  | Val Loss: 0.7989 Accuracy: 0.8906\n",
      "Epoch: 672/800 Train Loss: 0.5678 Accuracy: 0.9698 Time: 1.88642  | Val Loss: 0.7971 Accuracy: 0.8902\n",
      "Epoch: 673/800 Train Loss: 0.5668 Accuracy: 0.9707 Time: 1.92292  | Val Loss: 0.8019 Accuracy: 0.8887\n",
      "Epoch: 674/800 Train Loss: 0.5650 Accuracy: 0.9712 Time: 1.91306  | Val Loss: 0.8001 Accuracy: 0.8882\n",
      "Epoch: 675/800 Train Loss: 0.5699 Accuracy: 0.9687 Time: 1.88087  | Val Loss: 0.8010 Accuracy: 0.8877\n",
      "Epoch: 676/800 Train Loss: 0.5652 Accuracy: 0.9716 Time: 1.85639  | Val Loss: 0.8012 Accuracy: 0.8884\n",
      "Epoch: 677/800 Train Loss: 0.5668 Accuracy: 0.9705 Time: 1.90390  | Val Loss: 0.8035 Accuracy: 0.8870\n",
      "Epoch: 678/800 Train Loss: 0.5660 Accuracy: 0.9702 Time: 1.92026  | Val Loss: 0.7985 Accuracy: 0.8893\n",
      "Epoch: 679/800 Train Loss: 0.5677 Accuracy: 0.9701 Time: 1.85136  | Val Loss: 0.7995 Accuracy: 0.8883\n",
      "Epoch: 680/800 Train Loss: 0.5687 Accuracy: 0.9700 Time: 1.88105  | Val Loss: 0.8004 Accuracy: 0.8877\n",
      "Epoch: 681/800 Train Loss: 0.5678 Accuracy: 0.9702 Time: 1.91114  | Val Loss: 0.7968 Accuracy: 0.8899\n",
      "Epoch: 682/800 Train Loss: 0.5679 Accuracy: 0.9703 Time: 1.94249  | Val Loss: 0.7966 Accuracy: 0.8897\n",
      "Epoch: 683/800 Train Loss: 0.5668 Accuracy: 0.9696 Time: 1.93502  | Val Loss: 0.7978 Accuracy: 0.8888\n",
      "Epoch: 684/800 Train Loss: 0.5680 Accuracy: 0.9707 Time: 1.91703  | Val Loss: 0.7962 Accuracy: 0.8906\n",
      "Epoch: 685/800 Train Loss: 0.5693 Accuracy: 0.9693 Time: 1.93757  | Val Loss: 0.7973 Accuracy: 0.8905\n",
      "Epoch: 686/800 Train Loss: 0.5658 Accuracy: 0.9711 Time: 1.85163  | Val Loss: 0.7972 Accuracy: 0.8897\n",
      "Epoch: 687/800 Train Loss: 0.5678 Accuracy: 0.9702 Time: 1.95140  | Val Loss: 0.7969 Accuracy: 0.8903\n",
      "Epoch: 688/800 Train Loss: 0.5647 Accuracy: 0.9711 Time: 1.95074  | Val Loss: 0.7977 Accuracy: 0.8905\n",
      "Epoch: 689/800 Train Loss: 0.5670 Accuracy: 0.9705 Time: 1.83584  | Val Loss: 0.7991 Accuracy: 0.8893\n",
      "Epoch: 690/800 Train Loss: 0.5665 Accuracy: 0.9713 Time: 1.86244  | Val Loss: 0.7970 Accuracy: 0.8911\n",
      "Epoch: 691/800 Train Loss: 0.5672 Accuracy: 0.9707 Time: 1.84093  | Val Loss: 0.7986 Accuracy: 0.8897\n",
      "Epoch: 692/800 Train Loss: 0.5630 Accuracy: 0.9715 Time: 1.87520  | Val Loss: 0.7974 Accuracy: 0.8902\n",
      "Epoch: 693/800 Train Loss: 0.5685 Accuracy: 0.9697 Time: 1.92071  | Val Loss: 0.7977 Accuracy: 0.8922\n",
      "Epoch: 694/800 Train Loss: 0.5667 Accuracy: 0.9710 Time: 1.95764  | Val Loss: 0.7963 Accuracy: 0.8909\n",
      "Epoch: 695/800 Train Loss: 0.5709 Accuracy: 0.9681 Time: 1.87249  | Val Loss: 0.7942 Accuracy: 0.8913\n",
      "Epoch: 696/800 Train Loss: 0.5656 Accuracy: 0.9715 Time: 1.96720  | Val Loss: 0.7940 Accuracy: 0.8917\n",
      "Epoch: 697/800 Train Loss: 0.5669 Accuracy: 0.9702 Time: 1.87745  | Val Loss: 0.7966 Accuracy: 0.8912\n",
      "Epoch: 698/800 Train Loss: 0.5692 Accuracy: 0.9695 Time: 1.91410  | Val Loss: 0.7933 Accuracy: 0.8913\n",
      "Epoch: 699/800 Train Loss: 0.5648 Accuracy: 0.9712 Time: 1.98393  | Val Loss: 0.7969 Accuracy: 0.8895\n",
      "Epoch: 700/800 Train Loss: 0.5667 Accuracy: 0.9700 Time: 1.88215  | Val Loss: 0.7973 Accuracy: 0.8911\n",
      "Epoch: 701/800 Train Loss: 0.5683 Accuracy: 0.9694 Time: 1.85672  | Val Loss: 0.7961 Accuracy: 0.8901\n",
      "Epoch: 702/800 Train Loss: 0.5630 Accuracy: 0.9730 Time: 1.88890  | Val Loss: 0.7969 Accuracy: 0.8917\n",
      "Epoch: 703/800 Train Loss: 0.5636 Accuracy: 0.9716 Time: 1.98960  | Val Loss: 0.7972 Accuracy: 0.8904\n",
      "Epoch: 704/800 Train Loss: 0.5681 Accuracy: 0.9699 Time: 1.90344  | Val Loss: 0.7959 Accuracy: 0.8895\n",
      "Epoch: 705/800 Train Loss: 0.5638 Accuracy: 0.9719 Time: 1.87436  | Val Loss: 0.7960 Accuracy: 0.8920\n",
      "Epoch: 706/800 Train Loss: 0.5643 Accuracy: 0.9714 Time: 1.88488  | Val Loss: 0.7977 Accuracy: 0.8902\n",
      "Epoch: 707/800 Train Loss: 0.5676 Accuracy: 0.9705 Time: 1.90541  | Val Loss: 0.7962 Accuracy: 0.8920\n",
      "Epoch: 708/800 Train Loss: 0.5678 Accuracy: 0.9704 Time: 1.94973  | Val Loss: 0.7957 Accuracy: 0.8923\n",
      "Epoch: 709/800 Train Loss: 0.5652 Accuracy: 0.9710 Time: 1.98148  | Val Loss: 0.7980 Accuracy: 0.8912\n",
      "Epoch: 710/800 Train Loss: 0.5660 Accuracy: 0.9707 Time: 1.89334  | Val Loss: 0.7969 Accuracy: 0.8919\n",
      "Epoch: 711/800 Train Loss: 0.5667 Accuracy: 0.9709 Time: 1.89392  | Val Loss: 0.7962 Accuracy: 0.8914\n",
      "Epoch: 712/800 Train Loss: 0.5658 Accuracy: 0.9711 Time: 1.88982  | Val Loss: 0.7973 Accuracy: 0.8899\n",
      "Epoch: 713/800 Train Loss: 0.5675 Accuracy: 0.9700 Time: 1.85318  | Val Loss: 0.7979 Accuracy: 0.8904\n",
      "Epoch: 714/800 Train Loss: 0.5687 Accuracy: 0.9695 Time: 1.90154  | Val Loss: 0.7963 Accuracy: 0.8910\n",
      "Epoch: 715/800 Train Loss: 0.5695 Accuracy: 0.9690 Time: 1.88100  | Val Loss: 0.7965 Accuracy: 0.8902\n",
      "Epoch: 716/800 Train Loss: 0.5660 Accuracy: 0.9710 Time: 1.88185  | Val Loss: 0.7967 Accuracy: 0.8902\n",
      "Epoch: 717/800 Train Loss: 0.5657 Accuracy: 0.9706 Time: 1.85414  | Val Loss: 0.7968 Accuracy: 0.8904\n",
      "Epoch: 718/800 Train Loss: 0.5618 Accuracy: 0.9729 Time: 1.95351  | Val Loss: 0.7977 Accuracy: 0.8911\n",
      "Epoch: 719/800 Train Loss: 0.5626 Accuracy: 0.9725 Time: 1.90785  | Val Loss: 0.7974 Accuracy: 0.8904\n",
      "Epoch: 720/800 Train Loss: 0.5660 Accuracy: 0.9707 Time: 1.91166  | Val Loss: 0.7976 Accuracy: 0.8893\n",
      "Epoch: 721/800 Train Loss: 0.5658 Accuracy: 0.9706 Time: 1.88441  | Val Loss: 0.7984 Accuracy: 0.8903\n",
      "Epoch: 722/800 Train Loss: 0.5634 Accuracy: 0.9719 Time: 1.88189  | Val Loss: 0.7975 Accuracy: 0.8901\n",
      "Epoch: 723/800 Train Loss: 0.5651 Accuracy: 0.9709 Time: 1.87372  | Val Loss: 0.7973 Accuracy: 0.8893\n",
      "Epoch: 724/800 Train Loss: 0.5643 Accuracy: 0.9717 Time: 1.88966  | Val Loss: 0.7979 Accuracy: 0.8898\n",
      "Epoch: 725/800 Train Loss: 0.5652 Accuracy: 0.9713 Time: 1.94243  | Val Loss: 0.7963 Accuracy: 0.8905\n",
      "Epoch: 726/800 Train Loss: 0.5668 Accuracy: 0.9703 Time: 1.88047  | Val Loss: 0.7971 Accuracy: 0.8901\n",
      "Epoch: 727/800 Train Loss: 0.5652 Accuracy: 0.9709 Time: 1.91273  | Val Loss: 0.7971 Accuracy: 0.8888\n",
      "Epoch: 728/800 Train Loss: 0.5659 Accuracy: 0.9711 Time: 1.96540  | Val Loss: 0.7969 Accuracy: 0.8900\n",
      "Epoch: 729/800 Train Loss: 0.5634 Accuracy: 0.9722 Time: 1.85947  | Val Loss: 0.7970 Accuracy: 0.8907\n",
      "Epoch: 730/800 Train Loss: 0.5653 Accuracy: 0.9712 Time: 1.86256  | Val Loss: 0.7969 Accuracy: 0.8904\n",
      "Epoch: 731/800 Train Loss: 0.5668 Accuracy: 0.9702 Time: 1.96024  | Val Loss: 0.7963 Accuracy: 0.8908\n",
      "Epoch: 732/800 Train Loss: 0.5642 Accuracy: 0.9720 Time: 1.92126  | Val Loss: 0.7948 Accuracy: 0.8902\n",
      "Epoch: 733/800 Train Loss: 0.5643 Accuracy: 0.9717 Time: 1.86838  | Val Loss: 0.7952 Accuracy: 0.8897\n",
      "Epoch: 734/800 Train Loss: 0.5649 Accuracy: 0.9716 Time: 1.91799  | Val Loss: 0.7959 Accuracy: 0.8899\n",
      "Epoch: 735/800 Train Loss: 0.5655 Accuracy: 0.9710 Time: 1.98147  | Val Loss: 0.7952 Accuracy: 0.8908\n",
      "Epoch: 736/800 Train Loss: 0.5633 Accuracy: 0.9730 Time: 1.94882  | Val Loss: 0.7959 Accuracy: 0.8909\n",
      "Epoch: 737/800 Train Loss: 0.5646 Accuracy: 0.9712 Time: 1.89628  | Val Loss: 0.7953 Accuracy: 0.8906\n",
      "Epoch: 738/800 Train Loss: 0.5660 Accuracy: 0.9710 Time: 1.91928  | Val Loss: 0.7947 Accuracy: 0.8923\n",
      "Epoch: 739/800 Train Loss: 0.5615 Accuracy: 0.9725 Time: 2.00902  | Val Loss: 0.7941 Accuracy: 0.8912\n",
      "Epoch: 740/800 Train Loss: 0.5630 Accuracy: 0.9717 Time: 2.01957  | Val Loss: 0.7944 Accuracy: 0.8909\n",
      "Epoch: 741/800 Train Loss: 0.5634 Accuracy: 0.9716 Time: 1.87853  | Val Loss: 0.7955 Accuracy: 0.8910\n",
      "Epoch: 742/800 Train Loss: 0.5640 Accuracy: 0.9711 Time: 1.86444  | Val Loss: 0.7966 Accuracy: 0.8915\n",
      "Epoch: 743/800 Train Loss: 0.5656 Accuracy: 0.9711 Time: 1.86263  | Val Loss: 0.7948 Accuracy: 0.8910\n",
      "Epoch: 744/800 Train Loss: 0.5621 Accuracy: 0.9722 Time: 1.86613  | Val Loss: 0.7953 Accuracy: 0.8920\n",
      "Epoch: 745/800 Train Loss: 0.5666 Accuracy: 0.9708 Time: 1.87739  | Val Loss: 0.7956 Accuracy: 0.8919\n",
      "Epoch: 746/800 Train Loss: 0.5643 Accuracy: 0.9720 Time: 1.95731  | Val Loss: 0.7966 Accuracy: 0.8913\n",
      "Epoch: 747/800 Train Loss: 0.5633 Accuracy: 0.9722 Time: 1.85454  | Val Loss: 0.7958 Accuracy: 0.8913\n",
      "Epoch: 748/800 Train Loss: 0.5661 Accuracy: 0.9709 Time: 1.89884  | Val Loss: 0.7966 Accuracy: 0.8917\n",
      "Epoch: 749/800 Train Loss: 0.5635 Accuracy: 0.9717 Time: 1.89524  | Val Loss: 0.7958 Accuracy: 0.8914\n",
      "Epoch: 750/800 Train Loss: 0.5639 Accuracy: 0.9718 Time: 1.84844  | Val Loss: 0.7961 Accuracy: 0.8912\n",
      "Epoch: 751/800 Train Loss: 0.5654 Accuracy: 0.9714 Time: 1.90965  | Val Loss: 0.7958 Accuracy: 0.8918\n",
      "Epoch: 752/800 Train Loss: 0.5625 Accuracy: 0.9726 Time: 1.90151  | Val Loss: 0.7962 Accuracy: 0.8917\n",
      "Epoch: 753/800 Train Loss: 0.5627 Accuracy: 0.9717 Time: 1.86634  | Val Loss: 0.7957 Accuracy: 0.8915\n",
      "Epoch: 754/800 Train Loss: 0.5625 Accuracy: 0.9725 Time: 1.87133  | Val Loss: 0.7953 Accuracy: 0.8912\n",
      "Epoch: 755/800 Train Loss: 0.5652 Accuracy: 0.9708 Time: 1.85605  | Val Loss: 0.7950 Accuracy: 0.8919\n",
      "Epoch: 756/800 Train Loss: 0.5641 Accuracy: 0.9717 Time: 1.85940  | Val Loss: 0.7948 Accuracy: 0.8922\n",
      "Epoch: 757/800 Train Loss: 0.5658 Accuracy: 0.9710 Time: 1.88820  | Val Loss: 0.7950 Accuracy: 0.8916\n",
      "Epoch: 758/800 Train Loss: 0.5653 Accuracy: 0.9705 Time: 1.91444  | Val Loss: 0.7941 Accuracy: 0.8919\n",
      "Epoch: 759/800 Train Loss: 0.5625 Accuracy: 0.9721 Time: 1.98073  | Val Loss: 0.7949 Accuracy: 0.8915\n",
      "Epoch: 760/800 Train Loss: 0.5652 Accuracy: 0.9708 Time: 1.94067  | Val Loss: 0.7948 Accuracy: 0.8915\n",
      "Epoch: 761/800 Train Loss: 0.5656 Accuracy: 0.9717 Time: 1.89890  | Val Loss: 0.7949 Accuracy: 0.8921\n",
      "Epoch: 762/800 Train Loss: 0.5648 Accuracy: 0.9710 Time: 1.92085  | Val Loss: 0.7947 Accuracy: 0.8915\n",
      "Epoch: 763/800 Train Loss: 0.5648 Accuracy: 0.9711 Time: 1.94040  | Val Loss: 0.7947 Accuracy: 0.8916\n",
      "Epoch: 764/800 Train Loss: 0.5627 Accuracy: 0.9722 Time: 2.02426  | Val Loss: 0.7951 Accuracy: 0.8923\n",
      "Epoch: 765/800 Train Loss: 0.5644 Accuracy: 0.9715 Time: 1.85558  | Val Loss: 0.7957 Accuracy: 0.8913\n",
      "Epoch: 766/800 Train Loss: 0.5671 Accuracy: 0.9704 Time: 1.89875  | Val Loss: 0.7956 Accuracy: 0.8914\n",
      "Epoch: 767/800 Train Loss: 0.5622 Accuracy: 0.9721 Time: 1.84076  | Val Loss: 0.7950 Accuracy: 0.8913\n",
      "Epoch: 768/800 Train Loss: 0.5651 Accuracy: 0.9715 Time: 1.92971  | Val Loss: 0.7952 Accuracy: 0.8913\n",
      "Epoch: 769/800 Train Loss: 0.5658 Accuracy: 0.9707 Time: 1.92316  | Val Loss: 0.7956 Accuracy: 0.8913\n",
      "Epoch: 770/800 Train Loss: 0.5633 Accuracy: 0.9724 Time: 1.90190  | Val Loss: 0.7954 Accuracy: 0.8918\n",
      "Epoch: 771/800 Train Loss: 0.5638 Accuracy: 0.9718 Time: 1.90693  | Val Loss: 0.7954 Accuracy: 0.8923\n",
      "Epoch: 772/800 Train Loss: 0.5663 Accuracy: 0.9709 Time: 1.91167  | Val Loss: 0.7951 Accuracy: 0.8918\n",
      "Epoch: 773/800 Train Loss: 0.5650 Accuracy: 0.9711 Time: 1.90364  | Val Loss: 0.7951 Accuracy: 0.8916\n",
      "Epoch: 774/800 Train Loss: 0.5623 Accuracy: 0.9728 Time: 1.92278  | Val Loss: 0.7955 Accuracy: 0.8910\n",
      "Epoch: 775/800 Train Loss: 0.5641 Accuracy: 0.9717 Time: 1.93399  | Val Loss: 0.7949 Accuracy: 0.8914\n",
      "Epoch: 776/800 Train Loss: 0.5610 Accuracy: 0.9733 Time: 1.88344  | Val Loss: 0.7948 Accuracy: 0.8914\n",
      "Epoch: 777/800 Train Loss: 0.5636 Accuracy: 0.9716 Time: 1.85641  | Val Loss: 0.7946 Accuracy: 0.8913\n",
      "Epoch: 778/800 Train Loss: 0.5648 Accuracy: 0.9717 Time: 1.92636  | Val Loss: 0.7946 Accuracy: 0.8920\n",
      "Epoch: 779/800 Train Loss: 0.5637 Accuracy: 0.9719 Time: 1.86830  | Val Loss: 0.7947 Accuracy: 0.8926\n",
      "Epoch: 780/800 Train Loss: 0.5667 Accuracy: 0.9710 Time: 1.94519  | Val Loss: 0.7948 Accuracy: 0.8920\n",
      "Epoch: 781/800 Train Loss: 0.5669 Accuracy: 0.9708 Time: 1.87283  | Val Loss: 0.7949 Accuracy: 0.8919\n",
      "Epoch: 782/800 Train Loss: 0.5638 Accuracy: 0.9716 Time: 1.93832  | Val Loss: 0.7948 Accuracy: 0.8917\n",
      "Epoch: 783/800 Train Loss: 0.5646 Accuracy: 0.9716 Time: 1.89055  | Val Loss: 0.7948 Accuracy: 0.8916\n",
      "Epoch: 784/800 Train Loss: 0.5629 Accuracy: 0.9722 Time: 1.92066  | Val Loss: 0.7945 Accuracy: 0.8918\n",
      "Epoch: 785/800 Train Loss: 0.5621 Accuracy: 0.9727 Time: 1.93049  | Val Loss: 0.7947 Accuracy: 0.8915\n",
      "Epoch: 786/800 Train Loss: 0.5629 Accuracy: 0.9721 Time: 1.90856  | Val Loss: 0.7949 Accuracy: 0.8917\n",
      "Epoch: 787/800 Train Loss: 0.5619 Accuracy: 0.9731 Time: 1.90177  | Val Loss: 0.7948 Accuracy: 0.8918\n",
      "Epoch: 788/800 Train Loss: 0.5659 Accuracy: 0.9708 Time: 1.96026  | Val Loss: 0.7948 Accuracy: 0.8920\n",
      "Epoch: 789/800 Train Loss: 0.5625 Accuracy: 0.9723 Time: 1.86569  | Val Loss: 0.7950 Accuracy: 0.8920\n",
      "Epoch: 790/800 Train Loss: 0.5654 Accuracy: 0.9707 Time: 1.89617  | Val Loss: 0.7949 Accuracy: 0.8920\n",
      "Epoch: 791/800 Train Loss: 0.5644 Accuracy: 0.9714 Time: 1.85940  | Val Loss: 0.7949 Accuracy: 0.8919\n",
      "Epoch: 792/800 Train Loss: 0.5645 Accuracy: 0.9714 Time: 1.85681  | Val Loss: 0.7949 Accuracy: 0.8922\n",
      "Epoch: 793/800 Train Loss: 0.5639 Accuracy: 0.9718 Time: 1.86359  | Val Loss: 0.7948 Accuracy: 0.8922\n",
      "Epoch: 794/800 Train Loss: 0.5643 Accuracy: 0.9720 Time: 1.85210  | Val Loss: 0.7949 Accuracy: 0.8923\n",
      "Epoch: 795/800 Train Loss: 0.5647 Accuracy: 0.9710 Time: 1.86643  | Val Loss: 0.7950 Accuracy: 0.8920\n",
      "Epoch: 796/800 Train Loss: 0.5637 Accuracy: 0.9719 Time: 1.86065  | Val Loss: 0.7951 Accuracy: 0.8920\n",
      "Epoch: 797/800 Train Loss: 0.5657 Accuracy: 0.9706 Time: 1.95026  | Val Loss: 0.7950 Accuracy: 0.8920\n",
      "Epoch: 798/800 Train Loss: 0.5644 Accuracy: 0.9725 Time: 1.98373  | Val Loss: 0.7949 Accuracy: 0.8922\n",
      "Epoch: 799/800 Train Loss: 0.5646 Accuracy: 0.9722 Time: 1.92037  | Val Loss: 0.7948 Accuracy: 0.8920\n",
      "Epoch: 800/800 Train Loss: 0.5642 Accuracy: 0.9709 Time: 1.93376  | Val Loss: 0.7948 Accuracy: 0.8920\n"
     ]
    }
   ],
   "source": [
    "from hdd.train.warmup_scheduler import GradualWarmupScheduler\n",
    "from hdd.models.cnn.mlp_mixer import MLPMixer\n",
    "from hdd.train.classification_utils import naive_train_classification_model\n",
    "from hdd.models.nn_utils import count_trainable_parameter\n",
    "\n",
    "max_epochs = 800\n",
    "# 32x32 RGB inputs cut into 4x4 patches, which should give an 8x8 grid (64 tokens) per image.\n",
    "net = MLPMixer(\n",
    "    image_size=32,\n",
    "    channels=3,\n",
    "    patch_size=4,\n",
    "    dim=128,\n",
    "    depth=8,\n",
    "    num_classes=10,\n",
    ").to(DEVICE)\n",
    "print(f\"#Parameter: {count_trainable_parameter(net)}\")\n",
    "# Cross-entropy with label smoothing 0.1; note that beta2=0.99 is lower than Adam's default of 0.999.\n",
    "criteria = nn.CrossEntropyLoss(label_smoothing=0.1)\n",
    "optimizer = torch.optim.Adam(\n",
    "    net.parameters(),\n",
    "    lr=1e-3,\n",
    "    betas=(0.9, 0.99),\n",
    ")\n",
    "\n",
    "# Cosine decay from the base lr towards eta_min=1e-6 over max_epochs, preceded by a 5-epoch warmup.\n",
    "base_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(\n",
    "    optimizer, max_epochs, eta_min=1e-6\n",
    ")\n",
    "scheduler = GradualWarmupScheduler(\n",
    "    optimizer,\n",
    "    multiplier=1.0,\n",
    "    total_epoch=5,\n",
    "    after_scheduler=base_scheduler,\n",
    ")\n",
    "# Train the patch_size=4 configuration; verbose=True prints the per-epoch train/val metrics.\n",
    "patch_4 = naive_train_classification_model(\n",
    "    net,\n",
    "    criteria,\n",
    "    max_epochs,\n",
    "    train_dataloader,\n",
    "    val_dataloader,\n",
    "    DEVICE,\n",
    "    optimizer,\n",
    "    scheduler,\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "78cffdf8",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "pytorch-cu124",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
